Search Result Highlighting

Could someone please verify a search block behaviour for me? When I use the search block (8.2.1) to search for "lorem ipsum" (without quotes), it correctly lists the "Web Developer" page among the hits. However, it doesn't highlight the matching text, as it does for most other hits.

After a bit of digging, it seems that getPageIndexContent() doesn't return all of the searched text; in this case, it omits page attributes that were pulled in via a composer-based page type. The attributes seem to be marked as indexable (and are indeed found by the search). I've tried rebuilding the search index, but nothing changed.

Am I missing something?

Gondwana
 
MrKDilkington replied:
Hi Gondwana,

I can confirm this using the develop branch.

The Web Developer and Sales Associate pages are both returned in the results list, but without a highlighted match for "lorem ipsum".
Gondwana replied:
Thanks MrK. I'll file it on GitHub.
mnakalay replied (Best Answer):
Hey guys, that's not a bug; that's how the code works. Here's an explanation.

The code looks for "Lorem ipsum" in the page content, the page description, and its attributes.

In our case, the expression can only be found in the page description and in an attribute. If you edit the page, you will see that the text under "Location" (Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla massa lacus, vehicula eu interdum convallis, laoreet id lectus. Nunc turpis elit, aliquam sit amet aliquam tincidunt, dapibus vel tellus.) is from an attribute.

Anyway, the block view then does a few things. First, it tries to get the page's content highlighted using the function highlightedExtendedMarkup(). If the expression is not found in the text, the highlighting function returns nothing. So in our case, the content is not shown in the results.

What you see is the page description. The code first shortens the description to 255 characters and then highlights it, this time using a different function, highlightedMarkup(). That function (which is also used by the other highlighting function) returns the text whether or not it finds anything to highlight. In our case, there is no highlighting because the text was truncated to 255 characters, so the expression "Lorem ipsum" is no longer included.
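
To make the order of operations concrete, here is a minimal standalone sketch in plain PHP. It is not the concrete5 code: shortTextSketch() and highlightSketch() are hypothetical stand-ins for the shortening and highlighting helpers, the <mark> tag is an arbitrary choice, and the 255 default only mirrors the limit described above.

```php
<?php
// Hypothetical stand-ins for the block's helpers; NOT the concrete5 implementations.

// Cut text to at most $max bytes (default 255 assumed from the explanation above).
function shortTextSketch(string $text, int $max = 255): string
{
    return strlen($text) > $max ? substr($text, 0, $max) . '...' : $text;
}

// Wrap every case-insensitive occurrence of $query in <mark> tags.
// Text that contains no match is returned unchanged.
function highlightSketch(string $text, string $query): string
{
    $pattern = '/(' . preg_quote($query, '/') . ')/i';
    return preg_replace($pattern, '<mark>$1</mark>', $text);
}

// A made-up description whose only match starts after byte 255.
$description = str_repeat('x', 300) . ' Lorem ipsum dolor sit amet.';

// Truncate first, then highlight: the match is gone before highlighting runs.
$short = highlightSketch(shortTextSketch($description), 'lorem ipsum');

// With a longer limit the match survives truncation and gets highlighted.
$long = highlightSketch(shortTextSketch($description, 350), 'lorem ipsum');

var_dump(strpos($short, '<mark>') !== false); // bool(false)
var_dump(strpos($long, '<mark>') !== false);  // bool(true)
```

The page still counts as a hit in both cases; only the snippet shown to the user differs, because the highlighter never sees the part of the text that was cut off.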

Hope this helps
mnakalay replied:
Here's something you can do to test it. In the search block's view.php file, look for this line:
echo $this->controller->highlightedMarkup($tt->shortText($r->getCollectionDescription()), $query);

and modify it to change the truncating length like this:
echo $this->controller->highlightedMarkup($tt->shortText($r->getCollectionDescription(), 350), $query);

Here I set it to 350, and you'll see the "Lorem ipsum" highlighted.
Gondwana replied:
Thanks; I understand now. The page matches because it possesses the attribute (ignoring description); the fact that the attribute also appears in the content is immaterial.

It might still be desirable for the content snippet to show any displayed attributes so it doesn't seem so fragmentary.
Gondwana replied:
I submitted a suggestion to GitHub:
https://github.com/concrete5/concrete5/issues/6067...