Search Works Only With Single Words???

Ok, I'll admit that I've never paid much attention to C5's search facility. I don't generally perform searches on my clients' sites. However, I just had a client contact me about why it doesn't seem to work, and it turns out they're right. It doesn't even show any pages containing the site name (which is multiple words)! In fact, it appears (based upon some tinkering around) that it works only with single word search terms. Is this by design? That seems woefully inadequate and inconsistent with the way nearly every other search engine works.

I'm hoping there's something I'm missing in terms of setup or configuration to get this thing to work with phrases. For instance, if the phrase "wood shop" appears on numerous pages throughout the site, searching for "wood shop" returns absolutely nothing! Say whuh??!!! Searching for "wood" OR "shop" however, does return results. Is there a simple way to get the C5 search engine to work as most people would expect?

Thanks much!

-Steve

Shotster
 
andrew replied (Best Answer):
Yeah, this was an unfortunate bug that we've since fixed on GitHub. Try downloading concrete/blocks/search/controller.php from here and replacing it on the client's site:

https://github.com/concrete5/concrete5/tree/master/web/concrete/bloc...

It was just a stupid regular expression bug.

Also, another thing of note brought up by the first part of your post: this search is all page name and block/content-based. If a content block or a page name or page description contains your client's company name, it'll be returned. But we don't spider or index the entire HTML of a page, meaning that if you have a company name in a header or logo text somewhere, it's obviously not going to be returned.
Shotster replied:
Thanks Andrew, you pointed me in the right direction. There's something I don't understand though. Around line 149 of controller.php is the following...

if((empty($_REQUEST['query']) && $aksearch == false) || $this->resultsURL != '') {
   return false;      
}

I don't understand why false would be returned (and the search aborted) just because the user has specified a "Results Page" in the block configuration dialog. In other words, I don't understand the last comparison in the "if" statement.

I have my search block configured to post results to a different page, and the search just doesn't work unless I change that last comparison to "==" instead of "!=".

Any clarification would be appreciated.

Thanks,

-Steve
andrew replied:
I have no idea what that's doing. I think we can safely remove that resultsURL section.
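
For anyone following along, here is a minimal sketch of what the guard could look like with the resultsURL comparison dropped. The function name and the standalone wrapper are hypothetical and only there so the snippet runs on its own; in controller.php the check sits inline and reads from $_REQUEST.

```php
<?php
// Hypothetical standalone version of the guard, for illustration only.
// In the search block controller the same test is written inline.
function shouldSkipSearch(array $request, $aksearch)
{
    // Skip the search only when there is no query text
    // and no attribute-key search was requested.
    return empty($request['query']) && $aksearch == false;
}

// Usage: a request with a query string is no longer skipped,
// whether or not a results page is configured on the block.
$skip = shouldSkipSearch(array('query' => 'wood shop'), false);
```

With this version a block that posts to a separate results page behaves the same as one that shows results in place.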
alexaalto replied:
I also wanted to add that the search seems to match exact phrases only. Thanks to Google, however, people are accustomed to typing in several words and getting back documents that mention those words anywhere in the text. As I couldn't see any setting to change this, I added some code in models/page_list.php, replacing this line (#73):

$this->filter(false, "(psi.cName like $qk or psi.cDescription like $qk or psi.content like $qk {$attribsStr})");


...with this block of code:

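// Build one "like" clause per search word and join them with "or",
// so a page matches when it contains any of the words.
$words = explode(" ", $keywords);
$clauses = array();
foreach ($words as $word) {
   // The spaces inside the pattern only match words that have a space on both sides.
   $qw = $db->quote('% ' . $word . ' %');
   $clauses[] = "psi.cName like $qw or psi.cDescription like $qw or psi.content like $qw";
}
$filter = '(' . implode(' or ', $clauses) . " {$attribsStr})";
$this->filter(false, $filter);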


Does anyone know whether this is protected against SQL injection?
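
On the injection question: as long as every word goes through $db->quote() before it reaches the SQL string, and assuming quote() escapes values for the database driver as its name suggests, the words themselves should be safe. What quoting does not do is escape the LIKE wildcards "%" and "_", so a search for "50%" would match more than intended. A rough sketch of an alternative that uses placeholders and escapes the wildcards is below. The helper name buildKeywordFilter is made up for this example, and whether the page list's filter() method accepts bound parameters is something I have not checked, so treat it as an idea rather than a drop-in patch.

```php
<?php
// Hypothetical helper, not part of concrete5: builds an "or" filter with
// "?" placeholders so the search words never end up inside the SQL text.
function buildKeywordFilter($keywords)
{
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($keywords), -1, PREG_SPLIT_NO_EMPTY);
    $clauses = array();
    $params = array();
    foreach ($words as $word) {
        // Escape "%", "_" and "\" so they are matched literally by LIKE.
        $pattern = '%' . addcslashes($word, '%_\\') . '%';
        $clauses[] = '(psi.cName like ? or psi.cDescription like ? or psi.content like ?)';
        // One bound value for each of the three columns.
        array_push($params, $pattern, $pattern, $pattern);
    }
    return array(implode(' or ', $clauses), $params);
}

// Usage: the SQL fragment and its values stay separate until execution.
list($sql, $params) = buildKeywordFilter('wood  shop');
```

The fragment and the parameter array would then be handed to whatever prepared-statement call the database layer offers.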
andoro replied:
Dear andrew!

I'm from Hungary, we're using special latin2 characters: á é í ó ö ő ú ü ű

My experience is that the c5 search engine doesn't return any results when the search term contains these characters. I think it's a problem that affects international sites in general.

Could you help me?

An example on one of my c5-based sites: http://ak-s.hu/
Try to search for this: tartály
It will say: Nincs találat.... = No results....
Now try without "á": tartaly
Plenty of results :)

Do you have any idea?
Should I change something about the indexing or some other setting, or is this a bug?
Thx!
Shotster replied:
Could it possibly have anything to do with the DB character encoding used?

-Steve
andoro replied:
I'm using UTF-8 everywhere (db, files, etc.), which covers all of the special Hungarian characters.
Shotster replied:
Have you tried replacing the search block controller file as Andrew suggested in both this thread and the following one?

http://www.concrete5.org/community/forums/customizing_c5/a-weird-se...

-Steve
msglueck replied:
If it helps someone, here is my current search implementation.
It works with German umlauts (ÜÖÄüöäß), replaces alternative umlaut spellings with the real characters (e.g. Ue->Ü),
and works with multiple words (e.g. online clip) or hyphenated words (e.g. online-clip).

<?php 
   if( !empty($_REQUEST['query']) || isset($_REQUEST['akID']))  {
      $q = $_REQUEST['query'];
      $regex = array(); // replace Umlaut spelling with real Umlauts
      $regex[0] = '/ae/';
      $regex[1] = '/oe/';
      $regex[2] = '/ue/';   
      $regex[3] = '/Ae/';
      $regex[4] = '/Oe/';
      $regex[5] = '/Ue/';   
      $replace = array();
      $replace[0] = 'ä';
      $replace[1] = 'ö';
      $replace[2] = 'ü';
      $replace[3] = 'Ä';