Update

So, I started poking around with creating a Stored Procedure this morning and I finally found the “you’re missing something insanely obvious” part of the equation.  All I needed to do to mix the aggregate function with a normal select was use the GROUP BY statement (which had the added bonus of also taking care of the need to use SELECT DISTINCT).

Long story short, I just reduced all of that code I posted yesterday to this.

<cfquery name=”getHitData” datasource=”#request.datasource#”>
    SELECT search_log.keywords, count(search_log.keywords) AS hitCount
    FROM search_log
    WHERE 1 = 1
    <cfif isDefined(‘attributes.startDate’) AND trim(attributes.startDate) is not ”>
        AND search_log.queryDate >= #createODBCDate(attributes.startDate)#
    </cfif>
    <cfif isDefined(‘attributes.endDate’) AND trim(attributes.endDate) is not ”>
        AND search_log.queryDate <= #createODBCDate(attributes.endDate)#
    </cfif>   
    GROUP BY search_log.keywords
    ORDER BY hitCount DESC
</cfquery>

And the page went from taking about 30 seconds to run to instantly loading.

Yay.

Coding/SQL question…

Ok, folks….This one is going out to all of you who work in SQL, cold fusion, etc…

I am currently working on series of reports based on keyword search data that we have archived from our main web site.  The page I’m working on is supposed to display the most popular keywords in descending order.  My initial thought was that I’d be able to get the data with one SQL query like so…

SELECT Count(search_log.keyword) AS hitCount, search_log.keyword
FROM search_log
WHERE search_log.keyword IN (SELECT DISTINCT search_log.keyword FROM search_log)

The problem is that once I throw COUNT in the mix SQL doesn’t seem to like returning any additional data from the table, so I had to break it down and find the results with cold fusion.  Doing so requires me to run a huge series of queries to get all of my data and the page takes forever to load.  Here’s what I’m doing…

<cfquery name=”getLog” datasource=”#request.datasource#”>
    SELECT DISTINCT (search_log.keywords)
    FROM search_log
</cfquery>

<cfset keywordCountArray = ArrayNew(1)>

<cfloop query=”getLog”>
    <cfquery name=”getCount” datasource=”#request.datasource#”>
        SELECT COUNT(search_log.keywords) AS hitCount
        FROM search_log
        WHERE search_log.keywords = ‘#getLog.keywords#’
    </cfquery>
    <cfset keywordCountArray[getLog.currentRow] = getCount.hitCount>
</cfloop>

<cfset newColumn = queryAddColumn(getLog, ‘hitCount’, ‘Integer’, keywordCountArray)>

<cfquery name=”getLogData” dbtype=”query”>
    SELECT getLog.keywords, getLog.hitCount
    FROM getLog
    ORDER BY getLog.hitCount DESC
</cfquery>

My thoughts on how to improve this…

I could just cache the “getLogData” query and not run the rest of them if the query exists, but I’m not sure if I can check for the existence of a cached query in ColdFusion.

Do the logic in a stored procedure.  Would speed things up slightly but, I think, still require me to run a multitude of queries.

Re-write the SQL in some terribly simplistic way that I’m just not seeing because I’m being blind.

Any help out there, gang?