Kind of a major issue with the search block (lack of UTF-8 support)
I just started looking into this, but the current search block seems to strip everything that isn't latin1… I develop sites in Swedish, and it therefore strips all our precious little "åäö" characters from searches, rendering it rather useless.
(the create aliases from page name thingy also strips all non latin characters - What about character folding rules there?)
For the search, maybe use Sphinx as the search engine? Stemming and character folding rules are kind of neat to have around :) not to mention support for misspellings using, for example, Aspell dictionaries and/or custom dicts…
I haven't the faintest what kind of pain that would be to implement but it's an idea…
As I would really like to see a site search become usable.
They always suck beyond crap.
What I love about google is the spell corrector, I have found myself not even bothering to check spelling when searching as I know that it will 9/10 times find what I meant.
A feature which should be available within a sitesearch… or you might as well just start using the "site:" rule on google to do searches :)
anyways, anyone have any experience with the search block stripping characters from the search term? how to fix?
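The character folding idea mentioned above for page aliases could look roughly like the sketch below. It is written in Python purely for illustration and is not how concrete5 builds aliases; `fold_to_ascii` and `make_alias` are hypothetical names. The assumption is that folding "å", "ä" and "ö" to their base letters is acceptable for URLs.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Fold accented Latin letters to their base letter, e.g. 'ä' -> 'a'."""
    # NFKD decomposition splits 'ä' into 'a' plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_alias(page_name: str) -> str:
    """Hypothetical alias builder: fold, lowercase, join words with hyphens."""
    folded = fold_to_ascii(page_name).lower()
    # Replace anything that is not a letter or digit with a space, then split.
    words = "".join(ch if ch.isalnum() else " " for ch in folded).split()
    return "-".join(words)


print(make_alias("Blocks vänster"))  # blocks-vanster
print(make_alias("Åäö är bra"))      # aao-ar-bra
```

Letters that have no decomposition (for example "ø") pass through this kind of folding unchanged, so a real implementation would probably need an extra mapping table for those.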
right, seem to have those things in place. and yes it does indeed index utf8 characters.
but… when actually searching I get conflicting results.
for example searching for "Blocks vänster" I get results (including utf8 chars)
http://foretagsfokus.se/index.php/examples/search?search_paths%5B%5...
however when searching for only "vänster" I get nothing… why is this?
http://foretagsfokus.se/index.php/examples/search?search_paths%5B%5...
I see your point...
That's indeed weird.
If the search query were no more than 3 characters, you could troubleshoot it as a MySQL setting issue...
http://www.concrete5.org/index.php?cID=9672...
But your string is long enough....
Oh I found out why~!
Hey check your HTML code.
The "ä" in "vänster" is stored as an HTML entity in the page source (something like "v&auml;nster")... This is the reason why it doesn't come up... I think.
I believe concrete5 / tinymce escapes characters… damn, how would one go about solving this problem? but how does the search find vänster when coupled with the word next to it? I'm at a loss…
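One possible way around the escaping problem is to decode HTML entities before the text is indexed or compared. The sketch below only illustrates the mismatch in Python; it is not concrete5's indexer, the stored markup is an assumed example, and `normalize_for_index` is a hypothetical helper.

```python
import html

# Assumed example of what the editor may have stored in the database.
stored_html = "<p>Blocks v&auml;nster</p>"
query = "vänster"


def normalize_for_index(markup: str) -> str:
    """Decode HTML entities so that '&auml;' becomes a real 'ä'."""
    return html.unescape(markup)


# The raw markup does not contain the literal word, so a plain match fails.
print(query in stored_html)                       # False

# After decoding the entities the same query matches.
print(query in normalize_for_index(stored_html))  # True
```

The alternative would be to configure the editor so it stores raw UTF-8 characters instead of named entities, in which case no decoding step is needed.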
When I translated concrete5 to Japanese, I had to re-package TinyMCE as a Japanese version. The files are under
/concrete/js/tiny_mce/
You may have to do the same with the Swedish one.
But one simple solution would be to add a Google Search field to your web site and have Google index your site.
We Japanese have a different reason why we cannot use the MySQL search index.
But I was able to get the search indexed into MySQL.
There are a couple of reasons why it may not work for you.
1. Collation of your MySQL database (utf8_general_ci)
2. PHP internal encoding
3. Your server may not have mbstring installed
This is my PHP.INI setting
I checked that the search block was able to index UTF-8 characters.
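To illustrate why the encoding points in the list above matter, here is a small Python sketch (not PHP, and not taken from concrete5) of what happens when UTF-8 text is handled byte by byte or read with the wrong encoding. The function names are hypothetical.

```python
def utf8_length_mismatch(word: str) -> tuple[int, int]:
    """Return (character count, UTF-8 byte count) for a word."""
    return len(word), len(word.encode("utf-8"))


def misread_as_latin1(word: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as latin1."""
    return word.encode("utf-8").decode("latin-1")


chars, size = utf8_length_mismatch("vänster")
print(chars)  # 7 characters
print(size)   # 8 bytes, because 'ä' takes two bytes in UTF-8

# Text stored as UTF-8 but read as latin1 no longer equals the search term.
print(misread_as_latin1("vänster"))               # vÃ¤nster
print(misread_as_latin1("vänster") == "vänster")  # False
```

Functions that count or cut bytes instead of characters can split the two bytes of "ä", which is the kind of problem multibyte-aware string functions are meant to avoid.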