Building a Simple Search Engine with PHP
The Search Interface
Of course, users will not be able to work with the MySQL database directly.
Therefore, we'll create another PHP script that provides an HTML form to query
the database. This works just like any other search engine. The user enters a
word in a textbox, hits Enter, and receives a page of results linked to the
appropriate pages. The result order depends on the number of times a keyword
appears in each document. The search.php script is listed
below.
<?php
/*
 * search.php
 *
 * Script for searching a database populated with keywords by the
 * populate.php script.
 */
print "<html><head><title>My Search Engine</title></head><body>\n";
if( !empty($_POST['keyword']) )
{
    /* Connect to the database (the old mysql_* functions were removed
     * in PHP 7, so this listing uses mysqli): */
    $db = mysqli_connect("localhost", "root", "secret", "test")
        or die("ERROR: Could not connect to database!");
    /* Get timestamp before executing the query: */
    $start_time = getmicrotime();
    /* Escape $keyword and force $results to an integer to
     * minimize the risk of executing unwanted SQL commands: */
    $keyword = mysqli_real_escape_string( $db, $_POST['keyword'] );
    $results = isset($_POST['results']) ? (int) $_POST['results'] : 5;
    if( $results < 1 || $results > 20 )
        $results = 5;
    /* Execute the query that performs the actual search in the DB: */
    $result = mysqli_query( $db, "SELECT p.page_url AS url,
                COUNT(*) AS occurrences
         FROM page p, word w, occurrence o
         WHERE p.page_id = o.page_id AND
               w.word_id = o.word_id AND
               w.word_word = '$keyword'
         GROUP BY p.page_id, p.page_url
         ORDER BY occurrences DESC
         LIMIT $results" )
        or die("ERROR: Query failed!");
    /* Get timestamp when the query is finished: */
    $end_time = getmicrotime();
    /* Present the search results, escaping everything we echo back: */
    $shown = htmlspecialchars( $_POST['keyword'], ENT_QUOTES );
    print "<h2>Search results for '$shown':</h2>\n";
    for( $i = 1; $row = mysqli_fetch_assoc($result); $i++ )
    {
        $url = htmlspecialchars( $row['url'], ENT_QUOTES );
        print "$i. <a href='$url'>$url</a>\n";
        print "(occurrences: ".$row['occurrences'].")<br><br>\n";
    }
    /* Present how long it took to execute the query: */
    print "query executed in ".number_format($end_time - $start_time, 3)." seconds.";
}
else
{
    /* If no keyword is defined, present the search page instead: */
    print "<form method='post'> Keyword:
        <input type='text' size='20' name='keyword'>\n";
    print "Results: <select name='results'><option value='5'>5</option>\n";
    print "<option value='10'>10</option><option value='15'>15</option>\n";
    print "<option value='20'>20</option></select>\n";
    print "<input type='submit' value='Search'></form>\n";
}
print "</body></html>\n";
/* Simple function for retrieving the current timestamp in seconds,
 * with microsecond precision: */
function getmicrotime()
{
    return microtime(true);
}
?>
The script may be called with or without the keyword argument.
If it's defined, the script searches for that word in the database. It will
also show the length of time it took to process the query. Otherwise, the
script presents the search page instead. That page will resemble Figure 1.
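The two modes described above hinge on whether a usable keyword arrived with the request. Here is a minimal sketch of that decision, pulled out into a helper so it can be checked in isolation. The function name `read_search_request` and the fallback of five results are assumptions made for illustration; they are not part of the listed script.

```php
<?php
/**
 * Hypothetical helper: decide which mode search.php should run in.
 * Returns null when the search form should be shown, otherwise the
 * trimmed keyword and a result limit taken from the form's options.
 */
function read_search_request(array $post): ?array
{
    $keyword = isset($post['keyword']) ? trim((string) $post['keyword']) : '';
    if ($keyword === '') {
        return null;                      // no keyword: present the form
    }
    $allowed = [5, 10, 15, 20];           // the values offered by the <select>
    $results = isset($post['results']) ? (int) $post['results'] : 5;
    if (!in_array($results, $allowed, true)) {
        $results = 5;                     // assumed default
    }
    return ['keyword' => $keyword, 'results' => $results];
}
```

A script could call `read_search_request($_POST)` and show the form whenever it returns null.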

Figure 1 - our simple search page
Let's search on the keyword linux. Our dataset produces results similar to Figure 2.

Figure 2 - the search results page
As expected, onlamp.com appears first on the result page because the keyword linux appears more frequently on this site than on the others. A search for java would probably put onjava.com at the top, and xml would most likely generate the most hits for xml.com. Also note that we've limited the results to the five top-ranked pages.
Speeding Up the Database
As the bottom of the results page shows, the query took 0.393 seconds to execute. While this may not seem like an incredibly long time, it does represent quite a hit as the database grows. Fortunately, since we're using a database, there's a very simple solution.
CREATE INDEX word_word_ix ON word (word_word);
This will create an index in the word table on the
word_word column. Since all of our searches start with this
column, the database will find the appropriate pages much more quickly. To
prove this point, we will search for the keyword linux again, to see if we
gained any performance. See Figure 3.

Figure 3 - searching with an index
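A single run per configuration is a noisy measurement, so it can be worth averaging several runs before comparing the indexed and unindexed timings. The sketch below assumes a helper named `average_seconds`, which is hypothetical; the closure passed to it stands in for whatever code issues the search query.

```php
<?php
/**
 * Hypothetical helper: run $query $runs times and return the mean
 * wall-clock time per run in seconds.
 */
function average_seconds(callable $query, int $runs = 10): float
{
    if ($runs < 1) {
        throw new InvalidArgumentException('runs must be at least 1');
    }
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $query();
    }
    return (microtime(true) - $start) / $runs;
}

// Stand-in workload; in practice the closure would issue the search query.
$calls = 0;
$mean = average_seconds(function () use (&$calls) { $calls++; }, 5);
printf("mean: %.6f seconds over %d runs\n", $mean, $calls);
```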
Nice. It took 0.028 seconds, a saving of 0.365 seconds per query, which makes the search about 14 times faster. If this engine handled an average of 1,000 queries per hour, this would mean a savings of about 146 minutes per day.
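Reworking the figures from the two timings reported above (0.393 and 0.028 seconds) gives a speedup of roughly 14 times and a daily saving of roughly 146 minutes at 1,000 queries per hour. The variable names below are only for illustration.

```php
<?php
// Timings reported on the two results pages.
$before = 0.393;          // seconds, without the index
$after  = 0.028;          // seconds, with the index
$queriesPerHour = 1000;   // the load assumed in the text

$savedPerQuery = $before - $after;                            // 0.365 s
$speedup       = $before / $after;                            // about 14x
$savedMinutes  = $savedPerQuery * $queriesPerHour * 24 / 60;  // per day

printf("saved per query: %.3f s\n", $savedPerQuery);
printf("speedup: %.1fx\n", $speedup);
printf("saved per day: %.0f minutes\n", $savedMinutes);
```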
Summary
As shown in this article, useful search engines can be built pretty simply. Without much hassle, you could develop this concept further to handle multiple keywords, boolean operators, stop words, and other features you find in many commercial search facilities. It would also be interesting to populate the database further with a few hundred megs of data. Would the speed still be reasonable? Probably. One thing we could be absolutely sure of, however, is that for an intranet of a mid-sized company with just a few dozen searches per hour, this solution can offer stunning performance with minimal setup.
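As a starting point for the multiple-keyword and stop-word ideas mentioned above, a search phrase can be split into distinct keywords before any query is built. This is only a sketch: the helper name `extract_keywords`, the `\w+` word definition, and the stop-word list are all assumptions, and the indexer's own word rules may differ.

```php
<?php
/**
 * Hypothetical helper: split a search phrase into distinct lowercase
 * keywords and drop stop words.
 */
function extract_keywords(string $query, array $stopWords): array
{
    // \w+ is a simple "word" definition chosen for this sketch.
    preg_match_all('/\w+/', strtolower($query), $matches);
    $keywords = array_diff(array_unique($matches[0]), $stopWords);
    return array_values($keywords);
}

$stopWords = ['a', 'an', 'and', 'the', 'of', 'to'];   // illustrative list
print_r(extract_keywords('The history of Linux and the Linux kernel', $stopWords));
```

Each remaining keyword could then be searched for individually, or combined into one query.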
Whether you're planning to develop a big-scale commercial search engine, or
are just playing around, http://www.robotstxt.org/wc/robots.html
offers lots of helpful and interesting reading on this topic. For example, it
describes the use of the standardized robots.txt file, which every
Internet spider should use to determine what it can and can't do on a specific
site. Please read and follow the rules if you don't control the sites you
want to search.
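To give a feel for what honoring robots.txt involves, here is a deliberately simplified sketch. It only looks at `Disallow` prefix rules in groups addressed to `User-agent: *`, and it ignores `Allow` rules, wildcards, and rules aimed at a specific spider, so a real crawler would need something more complete. The function name `is_path_allowed` is hypothetical.

```php
<?php
/**
 * Hypothetical helper: returns false if $path starts with a Disallow
 * prefix found in a "User-agent: *" group of $robotsTxt.
 */
function is_path_allowed(string $robotsTxt, string $path): bool
{
    $applies = false;        // are we inside a group addressed to "*"?
    $prevWasAgent = false;   // consecutive User-agent lines share one group
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Drop comments and surrounding whitespace.
        $line = trim(preg_replace('/#.*$/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        list($field, $value) = array_map('trim', explode(':', $line, 2));
        if (strtolower($field) === 'user-agent') {
            $applies = ($prevWasAgent && $applies) || $value === '*';
            $prevWasAgent = true;
            continue;
        }
        $prevWasAgent = false;
        if (strtolower($field) === 'disallow' && $applies && $value !== ''
            && strpos($path, $value) === 0) {
            return false;
        }
    }
    return true;
}

$robots = "User-agent: *\nDisallow: /private/\nDisallow: /tmp\n\n"
        . "User-agent: otherbot\nDisallow: /\n";
var_dump(is_path_allowed($robots, '/index.html'));
var_dump(is_path_allowed($robots, '/private/a.html'));
```

An indexing script could fetch /robots.txt from the target host once and run this check before requesting each page.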
I wish you good luck and look forward to getting a visit from your spider soon. :)
Daniel Solin is a freelance writer and Linux consultant whose specialty is GUI programming. His first book, SAMS Teach Yourself Qt Programming in 24 hours, was published in May, 2000.
-
Bugfixes, Better Error Checking and Full Project Download
2009-01-10 06:56:33 ZIMSICAL.com
-
auto index lower level pages
2008-05-02 15:39:37 lektrikpuke
It's nice that it does one page, but what about a site (all lower levels)?
-
DEFINE A URL
2007-12-30 11:44:57 Arsench
Hello, I'm new to PHP and need some help, please. I put the code in place and I receive the error "You need to define a URL to process." Can you please give an example of where I have to define a URL, and what kind of URL?
Thanks
-
IIS problems
2007-10-16 06:49:35 snoski3
I guess this is supposed to work from what most posts have been saying. However, the populate and search php files don't work for me. The populate php doesn't give me an error yet there are no new entries in the database. The search page looks like this:
\n"; for( $i = 1; $row = mysql_fetch_array($result); $i++ ) { print "$i. ".$row['url']."\n"; print "(occurrences: ".$row['occurrences'].")
\n"; } /* Present how long it took the execute the query: */ print "query executed in ".(substr($end_time-$start_time,0,5))." seconds."; } else { /* If no keyword is defined, present the search page instead: */ print "
Keyword: \n"; print "Results: 5\n"; print "1015\n"; print "20\n"; print "
\n"; } print "\n"; /* Simple function for retrieving the current timestamp in microseconds: */ function getmicrotime() { list($usec, $sec) = explode(" ",microtime()); return ((float)$usec + (float)$sec); } ?>
Thank you for your help.
-
Help
2007-09-14 10:46:30 Chuddy
I just created my database, copied these two listings into two separate Notepad files, and saved them on my PHP server.
But when I run search.php, the following error message appears: "Notice: Undefined index: keyword in c:\program files\easyphp1-8\www\search.php on line 13". When I check line 13 of the code, it is "if( $_POST['keyword'] )". How do I fix this problem?
-
Am i doing something wrong
2007-09-02 10:42:05 louiscbrooks
Hi, I tried this tutorial and everything seems to be running smoothly. There's one problem, though: the populate.php script seems to index words 2 or 3 times each. Is this normal, or have I misconfigured something? Many thanks, Louis
-
MULTIPLE KEYWORD SEARCH
2007-07-13 09:46:17 -MJD-
Hi,
Great tutorial, converted it to ASP, works great.
Now need a little guidance on multiple keywords!
Any advice on the SQL statement?
Cheers.
MJD
-
Help - please!
2007-06-02 12:20:19 88guy
I have spent days on this thing and I have one enormous problem. By the way, I'm 55 and fairly new to all of this (PHP and MySQL). On my Linux server I easily created the database and added the tables. However, when I try to run the populate.php script in a browser I get a parsing error. I have added "populate.php?url=http://www.cnn.com/" at the bottom of the script, as advised by a previous poster. However, with little knowledge of PHP I am assuming that I am adding it minus brackets, in the wrong place, etc. Something is not right. If I do not add the previous line I simply get the message about needing to add a URL for indexing.
Very specifically, please: where does a line like this go and what, precisely, is its syntax (should it be preceded by } and followed by {, etc.)?
-
Natural Language search engine
2007-06-01 12:50:42 tennis_dunlop
Hi, this tutorial was great. But as many have mentioned so far, there are quite a few extra things that can be done. One of the main issues I have with this example is that it only searches for one word, with no real relationship between words other than the number of times the word is present in the text.
I would like to suggest a PHP search engine that I wrote (it's GNU licensed) that includes full on-screen installation, the ability to create different index zones on your website (perfect for multilingual sites), natural language search and query expansion using MySQL 5.0+, on-screen stats and reports about your users' searches, aggregation of users' searches to suggest complementary search terms, and much more. Have a look at http://blog.dstmichel.ca/index.php/2007/05/14/11-invenio-a-php-web-search-engine
And feel free to play and learn with all the included well-commented files.
I would of course greatly appreciate your comments and would gladly help if you need assistance.
Thanks,
Dennis
-
populate.php
2007-03-06 14:45:44 adevesa
I'm new to all of this PHP/MYSQL. I've been trying for days now to do this. My problems are:
1.How do I make the mysql server to run on the localhost.
2.when I index http://localhost/populate.php?url=http://www.macdevcenter.com/ I get exactly the same answer in the PAGE table, instead of only http://www.macdevcenter.com/
3.How does the populate do all the things it is supposed to do regarding reading the database?
I NEED HELP!!!!! I've been working hard to create a search engine with knowing nothing about computer programming until a few days ago.
Thank you.
-
modify
2007-02-28 15:00:04 katie_P
Hi, as mentioned already I'm new to PHP and MySQL. I must say it's quite interesting, far better than the media studies course I used to do lol
1. I was wondering, is there any way of tailoring the search engine so that it displays a description under the associated links?
2. Lastly, am I right in saying that you would add an else statement to show "please enter a search term" when there is no valid search word inside the box?
Thanks for your help guys
-
Simplifying the query
2007-01-25 15:45:35 pjdevitt
This seems obvious to me, but why are you storing every occurrence of a keyword within a document? A simpler solution would be to just count the number of occurrences of a word within the PHP code and write the value to the database. That would remove the GROUP BY used in most of your queries. The occurrence table would need to be modified to include a 'count' field. Here's a snippet of PHP that will create an array of words and the number of times they occur in the document.
$wordbank = array();
preg_match_all("/(\b[\w+]+\b)/", $buf, $words);
for($j=0; $j<count($words[0]); $j++) {
    $cur_word = addslashes(strtolower($words[0][$j]));
    /* Skip stop words; $filterWords is the exclusion list: */
    if(!in_array($cur_word, $filterWords)) {
        if(!isset($wordbank[$cur_word])) {
            $wordbank[$cur_word] = 0;
        }
        $wordbank[$cur_word]++;
    }
}
You can then iterate through the $wordbank array and add a new record to the occurrence table for each word.
-
Giving Back
2007-01-23 16:52:24 PHPchick
I found this really helpful in getting up & running fast and wanted to give back as a "thanks". So here's a little bit that I added in my version:
/* create an array of words you want to exclude */
$filterWords = array('a', 'about', 'an', 'and', 'are', 'as', 'at', 'be', 'by', 'from', 'how', 'i', 'in', 'is', 'it', 'nbsp', 'of', 'on', 'or', 'that', 'the', 'this', 'to', 'was', 'we', 'what', 'when', 'where', 'which', 'with');
... later in the code ...
/* Does the current word already have a record in the word-table? */
$cur_word = addslashes( strtolower($words[$i][$j]) );
/* add the following to filter unwanted words */
if (!in_array( $cur_word, $filterWords)) {
... database selects/inserts...
} -
Giving Back
2007-03-05 03:59:33 katie_P
Hi, sorry to bother you, but do you know how to search for more than one word? There was code provided earlier in this thread, but it doesn't work for me:
$result = mysql_query(" SELECT p.page_url AS url,
COUNT(*) AS occurrences
FROM page p, word w1, occurrence o1, word w2, occurrence o2
WHERE p.page_id = o1.page_id AND
w1.word_id = o1.word_id AND
w1.word_word = \"$keyword[1]\" AND
w2.word_id = o2.word_id AND
w2.word_word = \"$keyword[2]\" AND
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results" );
I think you need to change an SQL statment, could you please help me
Thanks
-
TSEP - ready, well featured PHP search engine
2007-01-20 23:48:53 ONG
Hi
I was happy to read the article, but it's one of many: there are several articles out on the net talking about search engine development.
Since I am the admin of a search engine on SourceForge (TSEP) ( http://www.tsep.info ), I want to take the chance to invite developers to join in on an advanced search engine development project. We are looking for dedicated developers right now.
Olaf
-
Fantastic Article
2006-08-25 06:44:05 JAS168
This was a great article. I used the same concept to create a search engine for all types of resources. My only suggestion would be to keep in mind that this article is not the end-all solution for searching on a website, but rather a place to start in coding.
-
Passing session id through fopen
2005-03-20 22:42:59 TennisOne
Thanks for the article. I am attempting to use it for our website. The populate.php script which executes the fopen passes in the URL to an article on our website. Every article on our website ensures that the user accessing the article is a member. I am executing the populate.php script as a member, however, when I execute the fopen call I lose all of my session information and get redirected to a join page because the article page does not think that I am a member.
Is there a way for me to pass session information via the URL even though
session.use_trans_sid = 0
in my php.ini file. Which I believe from a security standpoint is the right thing to do.
I tried passing my session information by defining
$url_with_sid = $url."?PHPSESSID=".session_id();
if (!($fd = fopen($url_with_sid, "r")))
Unfortunately the fopen fails with "failed to open stream: HTTP request failed!"
I can define the url with a query string parameter such as
$url_with_sid = $url."?hello=world";
And this works fine. Consequently, the fopen is not allowing the PHPSESSID query string parameter to be passed. Any thoughts or ideas would be greatly appreciated.
-
another success
2005-02-15 14:51:33 Lykerus
For those that were having trouble with the
Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource
Try something like this code:
$MySQLPassword = "password";
$HostName = "localhost";
$UserName = "username";
mysql_connect($HostName,$UserName,$MySQLPassword)
or die("ERROR: Could not connect to database!");
mysql_select_db("database_name");
I haven't worked on this engine for a few weeks, but decided to rewrite the connection script and see if I could get things working. It now works great!
Thanks for this tutorial! It is a great one!
-
success
2005-01-26 17:18:07 peetycox
Hi
Please ignore my previous post; it works a treat (although a little slower on my server).
I have one question. The example uses localhost and root with a password, which I understand is needed for populating the database. However, for the search, could I use localhost/any with limited permissions? Or am I just being paranoid?
Thanks great example.
Peetycox
-
mysql_fetch_array() error
2005-01-21 14:43:31 Lykerus
Hello,
I am having a strange problem when trying to populate the db. I am new to db's so be patient with me. I noticed that Sean had this same problem earlier, but I couldn't find the solution he found.
The error I get is:
mysql_fetch_array(): supplied argument is not a valid MySQL result resource
I tried a roundabout way of using this ($row = mysql_fetch_array($result);) tag, but it doesn't work. When I try and search, it seems as though the database hasn't been populated.
Can anyone help? Please and thank you! -
mysql_fetch_array() error
2005-01-26 15:54:55 peetycox
Hi
I'm getting the same error. It reads:
Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in d:\appserv\www\scn\search.php on line 30
Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in d:\appserv\www\scn\search.php on line 70
Indexing: social
It's driving me mad. This thing is great and just what I need. Please, someone tell me what I need to do.
Thanks
Peetycox
-
help me for populate.php
2004-12-03 00:36:42 setiawan77th
Hi all,
please help me with my problem. When I run
populate.php it always says "You need to define a
URL to process." but I have actually already put
a URL address in $url = addslashes( $_GET['http://localhost/newweb/start/'] ); Can somebody help me? -
help me for populate.php
2005-01-21 14:59:09 Lykerus
One other thing that I forgot to add to your question is that you shouldn't put your URL address inside the '$_GET' lookup. The URL cannot be processed if you keep it in there. If you change the 'http://localhost/newweb/start/' back to 'url' like it was before, and then do as I stated in my last message, you should be up and running.
Good luck. -
help me for populate.php
2005-01-21 14:35:57 Lykerus
If you haven't already been answered to this or figured this out yet, in order to process a URL, you need to add the url at the end of the populate.php file.
e.g. ...populate.php?url=http://www.cnn.com/
This basically tells the populate.php file which url to index.
Hope this helps!
-
help about the search engine
2004-03-03 02:00:56 dinesh2037
hi,
I need to add (a) the number of hits in the results, (b) previous and next links, and (c) a message displayed when no parameter is supplied.
I need the search engine in PHP, but I am quite new to PHP.
Please write to me if you can help.
bye
-
Search 0-40 words
2004-01-01 18:39:58 anonymous2
Hi all
I am still having problems with searching for multiple keywords, up to 40 words. Thank you, Giff, for editing the SQL statement, but that would only solve my problem for two-word searches.
My trail of thought is that the php script would have to break the string of words into substrings, then parse each substring as an individual word search, then searching for more than one searched word in the same article.
The more i play with this, the bigger mess it becomes. Can anyone help?
-Sean
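One way to approach a search for several words at once, sketched here under stated assumptions, is to match any of the keywords and then keep only the pages on which every distinct keyword was found. This swaps the self-join idea for `IN (...)` plus `HAVING COUNT(DISTINCT ...)`, and it assumes the page, word, and occurrence tables from the article, with each word stored once and in lowercase in the word table. The helper name `build_all_keywords_query` is hypothetical; it only builds the SQL text and the values for a prepared statement's `?` placeholders.

```php
<?php
/**
 * Hypothetical helper: build a parameterized query that returns pages
 * containing every one of the given keywords.
 * Returns [sql, params].
 */
function build_all_keywords_query(array $keywords, int $limit): array
{
    // Distinct lowercase keywords, so the HAVING count below is exact.
    $params = array_values(array_unique(array_map('strtolower', $keywords)));
    if (count($params) === 0) {
        throw new InvalidArgumentException('at least one keyword is required');
    }
    $placeholders = implode(', ', array_fill(0, count($params), '?'));
    $sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences
            FROM page p
            JOIN occurrence o ON o.page_id = p.page_id
            JOIN word w ON w.word_id = o.word_id
            WHERE w.word_word IN ($placeholders)
            GROUP BY p.page_id, p.page_url
            HAVING COUNT(DISTINCT w.word_id) = " . count($params) . "
            ORDER BY occurrences DESC
            LIMIT " . max(1, $limit);
    return [$sql, $params];
}

list($sql, $params) = build_all_keywords_query(['Linux', 'php', 'linux'], 5);
echo $sql, "\n";
```

The same query shape works for two words or forty, since only the number of placeholders changes.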
-
Add on for foreign language
2003-12-31 00:35:06 anonymous2
Hi All,
Thanks for this useful search engine.
The accents ("à" in French, for example) are encoded as "&agrave;" by HTML editors.
To get your search engine dealing with it, I put the following lines in populate.php :
/* Foreign site : convert french characters made by html editors : */
$patterns[0] = "/&nbsp;/";
$patterns[1] = "/&agrave;/";
$patterns[2] = "/&acirc;/";
$patterns[3] = "/&eacute;/";
$patterns[4] = "/&egrave;/";
$patterns[5] = "/&ecirc;/";
$patterns[6] = "/&icirc;/";
$patterns[7] = "/&ugrave;/";
$patterns[8] = "/&ucirc;/";
$patterns[9] = "/&ccedil;/";
$patterns[10] = "/&oelig;/";
$patterns[11] = "/&euro;/";
$patterns[12] = "/&copy;/";
$replacements[0] = " ";
$replacements[1] = "à";
$replacements[2] = "â";
$replacements[3] = "é";
$replacements[4] = "è";
$replacements[5] = "ê";
$replacements[6] = "î";
$replacements[7] = "ù";
$replacements[8] = "û";
$replacements[9] = "ç";
$replacements[10] = "œ";
$replacements[11] = "€";
$replacements[12] = "©";
$buf = preg_replace($patterns, $replacements, $buf);
BETWEEN LINE
$buf = ereg_replace('/&\w;/', '', $buf);
AND LINE
/* Extract all words matching the regexp from the current line: */
It's not big deal but it works and it is easy to adapt to foreign languages.
Regards,
Louis
http://www.interactive-trails.com
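As an alternative to listing entities one by one, PHP's built-in `html_entity_decode()` can convert all named entities in one call. The sketch below assumes the page text is handled as UTF-8 and wraps the call in a hypothetical helper, `decode_entities_for_indexing`. The indexer's word regexp would also need to accept accented letters for the decoded words to be indexed whole.

```php
<?php
/**
 * Hypothetical helper: turn HTML character references into real
 * characters before the text is split into words.
 */
function decode_entities_for_indexing(string $buf): string
{
    $decoded = html_entity_decode($buf, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // &nbsp; decodes to a no-break space (U+00A0); make it a plain space
    // so it separates words like any other whitespace.
    return str_replace("\xC2\xA0", ' ', $decoded);
}

echo decode_entities_for_indexing('caf&eacute; &agrave; la carte'), "\n";
```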
-
A more detailed query...
2003-12-23 06:52:17 anonymous2
Hi Daniel,
Thanks again for your fantastic tutorial. I was wondering how one might submit a query which excludes certain words and includes others. I know that the following should work for finding several words:
$result = mysql_query(" SELECT p.page_url AS url,
COUNT(*) AS occurrences
FROM page p, word w1, occurrence o1, word w2, occurrence o2
WHERE p.page_id = o1.page_id AND
w1.word_id = o1.word_id AND
w1.word_word = \"$keyword[1]\" AND
w2.word_id = o2.word_id AND
w2.word_word = \"$keyword[2]\" AND
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results" );
But how can I also tell it to NOT provide me with an article that contains a word I don't want as well?
- Giff -
A more detailed query...
2003-12-23 22:43:15 anonymous2
Hi Giff
Thank you for that. However, there is still a minor defect. This is what came up when i pasted your code in. And keep in mind that i have not altered anything stated in the article.
"Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /var/www/html/jeo/dev/search/search.php on line 48
query executed in 0.000 seconds."
So does anyone have a suggestion?
Please help!
Thank you all for helping me so far.
Best wishes for the festive season!
Regards,
Sean -
RE: A more detailed query...
2003-12-23 07:15:55 dsolin
Hi Giff,
I'm not sure if I got you right here, but couldn't you just use the != or NOT LIKE operators? This would result in something like:
w2.word_word != \"$keyword[1]\"
or:
w2.word_word NOT LIKE \"$keyword[1]\"
Hope that helps!
-Daniel -
RE: A more detailed query...
2003-12-23 07:44:06 anonymous2
I tried that earlier, but it doesn't seem to work. The reason (I think) is that there ARE occurrences of that word that count as not being in that article (namely, all the occurrences in other articles). -
RE: A more detailed query...
2003-12-23 08:12:37 anonymous2
Let me try to explain a little more clearly.
Say I have an article with the word "science" and the word "computer" in it, and the search asked for all articles with "science" but without "computer."
If I use this query:
w2.word_word != "computer"
then it will look for an occurrence o2 that doesn't point to "computer". However, this constraint would be satisfied instantly by any other word existing in the article; even the word "science" would satisfy it. As long as there is one instance of a word other than "computer" in that article, the search will bring it up.
If I put the "!= " further up with the article definition, we still get a similar problem:
p.page_id != o1.page_id
This occurrence o1 of "computer" may indeed not be in the page, simply because it is an occurrence of "computer" on another page. As long as there is any other page with "computer" on it, the query will bring up the page, so the search doesn't work.
What I really want is not:
"Find a page where there EXISTS a word "computer" not on that page"
but:
"Find a page that, for ALL occurrences of the word "computer", NONE of them are on it"
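The "none of them are on it" condition described above maps naturally onto a correlated `NOT EXISTS` subquery. What follows is a sketch under assumptions rather than a tested drop-in: it uses the article's page, word, and occurrence tables, relies on subquery support (MySQL 4.1 or later), and leaves the two `?` placeholders to be bound through a prepared statement.

```php
<?php
// Sketch of an "include one word, exclude another" search.
// Placeholder 1 is the wanted word, placeholder 2 the unwanted one.
$sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences
        FROM page p
        JOIN occurrence o ON o.page_id = p.page_id
        JOIN word w ON w.word_id = o.word_id
        WHERE w.word_word = ?
          AND NOT EXISTS (
                SELECT 1
                FROM occurrence o2
                JOIN word w2 ON w2.word_id = o2.word_id
                WHERE o2.page_id = p.page_id
                  AND w2.word_word = ?
              )
        GROUP BY p.page_id, p.page_url
        ORDER BY occurrences DESC";
$params = ['science', 'computer'];   // the example words from the post above
```

Because the subquery is tied to `p.page_id`, a page is rejected only when the unwanted word occurs on that same page, which is the distinction the `!=` attempts could not express.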
-
multiple keywords
2003-12-22 01:39:29 anonymous2
hi
thank you for those codes. they worked like a charm.
i was just wondering if there was additional syntax involved whereby i would be able to search more than 1 keyword at a time. For example, a phrase?
I have tried it on this search engine and it comes back with a null result.
Your help would be most welcomed.
Thank you,
Sean -
RE: multiple keywords
2003-12-22 01:57:16 dsolin
Hi Sean,
As pointed out by a previous poster, the easiest way to implement multiple keyword searching would probably be to use MySQL's Full-text Search. Detailed information about this feature can be found at:
http://www.mysql.com/doc/en/Fulltext_Search.html
You will need to rewrite the example program used in the article for it to use Full-text Search, but that should be quite simple to do.
Good luck!
Daniel -
RE: multiple keywords
2003-12-22 17:59:54 anonymous2
Thank you Daniel.
I have been looking at your link above, but have no luck in seeing this through. First of all, I still do not understand the FULLTEXT context. And how I would alter the table to incorporate such feature, then bringing it to the Search.php section.
I hope you can help me here.
Best regards,
Sean -
RE: multiple keywords
2004-09-21 10:57:36 cityslicker
Hi All,
Thanks for the code. It is great.
Just to inform you of my edits and how I handled multiple keywords.
1. First of all, I put the keyword extraction script in a function.
I use it to index the titles and keywords of pages as well as the main body text. I call the function three times for words in a title, twice for keywords, and once for main body text. This is a way of scoring a page.
E.g. when a search is done on 'business' and a page title has the word 'business' in it, it will show 3 occurrences (although I don't show occurrences; I just use the score to order the list).
2. Multiple keywords.
Basically, I use the explode() function to get an array of keywords and loop through them, applying each one to the query. I keep the scores for each word and add them together before displaying the list by highest score.
//CODE
/* Get timestamp before executing the query: */
$start_time = getmicrotime();
$keyword_array = explode(" ", $_GET['keyword']);
$score = array();
foreach($keyword_array as $keyword)
{
    /* Use addslashes() to minimize the risk of executing
     * unwanted SQL commands: */
    $keyword = addslashes( $keyword );
    /* Execute the query that performs the actual search in the DB: */
    $result = mysql_query(" SELECT p.page_title AS title,
            COUNT(*) AS occurrences
            FROM pages p, word w, occurrence o
            WHERE p.pageID = o.page_id AND
            w.word_id = o.word_id AND
            w.word_word = \"$keyword\"
            GROUP BY p.pageID
            ORDER BY occurrences DESC
            LIMIT 0, 5" );
    for( $i = 1; $row = mysql_fetch_array($result); $i++ )
    {
        /* Start each title at zero before adding to its score: */
        if( !isset($score[$row['title']]) )
            $score[$row['title']] = 0;
        $score[$row['title']] += $row['occurrences']; //Array of scores
    }
}
if(count($score) > 0)
{
    arsort($score); //Sort the associative array of scores, highest first
    /* Get timestamp when the query is finished: */
    $end_time = getmicrotime();
    /* Present the search-results: */
    print "<h2>Search results for '".htmlspecialchars($_GET['keyword'])."':</h2>\n";
    //Loop through array and output results
    foreach( $score as $title => $points )
    {
        echo $title;
        echo " - ";
        echo $points;
        echo "<br>\n";
    }
    /* Present how long it took to execute the query: */
    print "query executed in ".(substr($end_time-$start_time,0,5))." seconds.";
}
else
{
    //Display a no pages found page
}
// END CODE
This works fine but is a little slower if the user wants to search for a sentence. All in all, it is an easy add-on to the already supplied code that provides multiple keyword searching.
Hope this helps someone! -
RE: multiple keywords EDIT
2004-09-21 11:10:36 cityslicker
Hi again,
Just a small edit to the code above.
If a user searched for "good web sites" and a page contained hundreds of occurrences of 'good' but no 'web' or 'sites', it would rank higher than a page that has all three. This is not what we want, so amend the above code with this part:
for( $i = 1; $row = mysql_fetch_array($result); $i++ )
{
$score[$row['title']] += $row['occurrences']; //Array of scores
if($row['occurences'] > 0) { $score[$row['title']] += 1000; } //This makes pages containing all keywords rank highest
}
You can set the 1000 to whatever you like but you should be safe with that number.
-
RE: multiple keywords EDIT
2004-09-21 11:19:15 cityslicker
Hi once more!!
Make sure you spell occurrences correctly unlike in my code above!!
-
RE: multiple keywords EDIT
2009-11-04 09:45:30 xoqqa
Would you be so kind to send me your version of this search engine please? I've been trying to figure out what is wrong with mine and noticed that yours is somewhat different. For example I don't have page_title but just urls... I think you've also modified the populate script.
I would appreciate if you can send me the script files on sammutmatu[at]gmail.com
Thanks
-
A more difficult search engine
2003-12-18 08:16:26 anonymous2
I need to develop a PHP/MySQL search engine for my company (the publisher of an academic magazine) and my boss has expressed a desire for an "exact phrase option." For instance, when a user types in "martin luther king", the search engine would only bring up articles on Dr. King, rather than articles about Martin Luther and his dealings with German royalty. However, this seems like a terribly difficult thing to implement. Your tutorial has been a great deal of help on this topic, and I was wondering if you might point me in the right direction.
- Giff -
RE: A more difficult search engine
2003-12-19 00:49:31 dsolin
Hi Giff,
From what I can understand by reading your description, this is something that you will need to implement in the indexing mechanism of your search engine -- when the user provides the engine with a search phrase, the backend needs to already know the difference between "an article on Dr. King" and "an article about Martin Luther and his dealings with German royalty".
So, without being able to get into much detail, I think you need to implement this in your backend database. Maybe you should add a column that indicates the subject of a certain URL: is it an article on Dr. King, or one about Martin Luther and his dealings with German royalty? Of course, the hardest part of such a project would be to implement logic in the indexing mechanism that calculates the value of this column. Maybe it needs to be done manually?
If you find a working solution, Giff, please feel free to post it here. I'm sure that would be interesting reading for many of us. Good luck!
-Daniel
-
Search in dynamic page
2003-10-20 03:29:22 anonymous2
As far as I can see, this search method does not search dynamic pages such as mypage.php?id=46 or something similar.
How can i do that ? run populate.php with a loop for all params in URL ? :
for ($i=1; $i < 50; $i++) {
system("populate.php?id=".$i);
}
...
Matt -
RE: Search in dynamic page
2003-10-20 04:22:54 dsolin
Hi Matt,
With this (simple) example, all pages need to be indexed via HTTP individually. However, just as you imply, you could quite easily automate this task to index several pages in a batch. As you might know, you can make HTTP requests using PHP's fopen() function, so you could do something like this:
for($i = 1; $i < 50; $i++)
{
    /* fopen() needs a mode argument; "r" opens the URL for reading: */
    $fd = fopen("http://www.mysite.com/mypage.php?id=".$i, "r");
    if( $fd ) fclose($fd);
}
Good luck!
Daniel
-
it doesn't work?
2003-09-20 16:31:11 anonymous2
i'm a total php noob, but i uploaded the 2 pages to my webhost (lycos) but it won't print the search results, i adjusted the files to my database and so on it prints everything but the results...?
-
Nice
2003-09-04 12:12:14 anonymous2
Great solution! I tried it out tonight and got it working like a charm in half the time I thought it would take me!
Keep up the good work!
-
Index through the file system
2002-11-04 03:48:05 anonymous2
For building a local index, it is much more
efficient to do it through the filesystem instead
of through the HTTP server.
-
Use Exclusion Tags
2002-11-04 03:27:27 anonymous2
Many search engines recognize tags that instruct them NOT to parse the markup that they enclose.
For example ...
<!-- stop_indexing -->
Here there might be a menu or other markup
that should not be indexed
<!-- start_indexing -->
It ensures that only words derived from
significant content are indexed ... makes it
more precise for the user, and of course the
index is smaller -- so the whole thing works
faster.
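A sketch of how an indexer might honor such markers before extracting words is shown below. The marker names are the ones from the example above; the helper name `strip_unindexed` is hypothetical, and an unclosed stop marker is simply left untouched in this version.

```php
<?php
/**
 * Hypothetical helper: remove everything between a stop_indexing
 * comment and the next start_indexing comment.
 */
function strip_unindexed(string $html): string
{
    // The /s modifier lets "." match newlines, so blocks may span lines.
    $pattern = '/<!--\s*stop_indexing\s*-->.*?<!--\s*start_indexing\s*-->/s';
    return preg_replace($pattern, '', $html) ?? $html;
}

$page = "Intro <!-- stop_indexing -->menu, links<!-- start_indexing --> body text";
echo strip_unindexed($page), "\n";
```

The indexing script could run its page buffer through this step before its word-matching regexp.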
-
htDig
2002-10-31 13:21:44 anonymous2
Stand on some giant's shoulders -
www.htdig.org - an open-sourced spider/search engine used extensively throughout the world.
-
Optimizing multiple occurrences of same word on same page
2002-10-30 08:07:18 anonymous2
Instead of inserting several records for the same word on the same page, you could add another field to the occurrence table which indicates the number of occurrences of the word on the referenced page. This reduces the number of records per page to the number of distinct words. Also consider stop words (excluding some words from the index is always good, e.g. "a", "the", "and").
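A sketch of the counting step this suggestion relies on: collapse a page's text into one entry per distinct word, with stop words removed, so that each entry can become a single row carrying a count. The helper name `count_words` and the `\w+` word definition are assumptions for illustration.

```php
<?php
/**
 * Hypothetical helper: map each indexed word to the number of times it
 * occurs in $text, skipping stop words.
 */
function count_words(string $text, array $stopWords): array
{
    preg_match_all('/\w+/', strtolower($text), $matches);
    $words = array_diff($matches[0], $stopWords);
    return array_count_values($words);
}

print_r(count_words('The cat and the hat', ['the', 'and']));
```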
-
simple search
2002-10-28 23:33:02 anonymous2
Nice solution. I did the same thing in Perl, using the same approach.
The only difference is that I just dump all the data and do a refill each night, instead of checking if there's a record.
-
MySQL Fulltext?
2002-10-28 00:01:55 anonymous2
Hey,
Why not use the MySQL Fulltext?
This gives you support for boolean operations and stopwords.
It might be a good idea.
-- Brian -
MySQL Fulltext?
2003-05-21 12:52:26 anonymous2
Great idea, if your ISP supports MySQL 3.23....
>Hey,
>Why not use the MySQl Fulltext?
>This gives you support for boolean operations >and stopwords.
>It might be a good idea.
-- Brian
-
MySQL Fulltext?
2008-05-13 23:07:27 manrah
I have a problem with search.
Suppose I do a search on name, age, and sex, which I collect through POST on the search page. On the basis of these parameters I get 10 pages of search results, with 10 profiles per page. But when I navigate to the next or previous results page, I lose the parameters, i.e. name, age, and sex. How can I overcome this problem?



http://www.zimsical.com/portfolio/php/oreilly-search-engine-tutorial-fixed.zip