Express Objects not being indexed in search results (8.1.0)

Search is functioning normally on my site, but when I search for information contained in an Express Object, nothing appears.

For example, I have a team of people who are all express objects and output to a Leadership page. When I search for the name of a person, the search says they can't be found. But when I go to the Leadership page, the name is there. So my theory is that the express objects aren't indexed in the system.

Is there possibly a toggle I'm missing, or is this a limitation due to the relative newness of Express objects?

I have checked and these pages are not excluded from the search index.

Any ideas? Thanks!

 
Ronaldo replied:
Hi fatwreck,

Did you find a solution to this question? I need help too...

Thanks
fatwreck replied (Best Answer):
Hi there,
Since it was time sensitive, we ended up ditching the C5 built-in search and set up Sphider 1.5.1. It's a database-backed search engine and wasn't too tough to get up and running. Make sure not to use the original Sphider, since it relies on out-of-date PHP. The one I'm linking to is a modified version of the original that works with more modern standards.

Good luck!

Check it out here:
https://www.blog.worldspaceflight.com/downloads/...
OKDnet replied:
I'm just guessing here, but why not upgrade to 8.2? The release is all about bug fixes, and it contains a lot of them (fixes).
ebirt replied:
We are running 8.2, and are running into similar issues on our site. I'm curious to hear some specific ways to move forward with getting express object data that is displayed in blocks to be indexed by search.

Thanks!
cryophallion replied:
I just discovered this too. However, I think that due to how search is done in C5, it may not be feasible in house.

Right now, I believe search works by reading the blocks on each page, the specific attributes for each block, and the page attributes. But Express entries can be in many places on many pages, depending on how the data works. Search would have to run the block queries for Express just to see if the data is on that page, and that would be quite a huge endeavor on data-heavy sites.

So, either you make each item its own page, so it can search, or you install a 3rd party indexing program.
ebirt replied:
Our solution is a bit of a workaround, but is not completely inelegant:

I created a new text attribute for all pages, and this attribute is set to be indexed by search. It's called something like "Generated Keywords".

The controller for the block that displays the express object was then extended to parse keywords from the express object text fields and store them in "Generated Keywords", thus exposing the content to the search on a per page basis.

One final step was to add a second page attribute called "Remove Keywords". Any words in this field are removed from "Generated Keywords", thus giving our content manager a higher degree of control.
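
The keyword step described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not concrete5 code: `generate_keywords` and its parameters are hypothetical names, and reading the Express text fields and saving the result to the "Generated Keywords" page attribute are left to the block controller.

```python
import re


def generate_keywords(text_fields, remove_keywords=""):
    """Build a space-separated keyword string from Express text fields.

    text_fields: iterable of strings taken from the entry's text attributes
                 (hypothetical input; how they are read is not shown here).
    remove_keywords: contents of the "Remove Keywords" page attribute.
    """
    # Words the content manager wants kept out of the index.
    removed = {w.lower() for w in re.findall(r"\w+", remove_keywords)}

    seen = set()
    keywords = []
    for text in text_fields:
        for word in re.findall(r"\w+", text):
            key = word.lower()
            # Skip removed words and duplicates, keep first-seen order.
            if key in removed or key in seen:
                continue
            seen.add(key)
            keywords.append(key)
    return " ".join(keywords)


if __name__ == "__main__":
    print(generate_keywords(["Jane Doe", "Chief Executive Officer"],
                            remove_keywords="chief officer"))
```

The returned string is what would be stored in the indexed attribute, so the page becomes findable by any remaining word.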

It's still no Google, but it has been pretty effective.

Hope that helps.
cryophallion replied:
So, did you override the express detail block controller, or create your own new block type for this? It's a fascinating idea. The only concern is how to do this in a "generic" way, so you don't need a different block type for each entity. Or does it search for any text/textarea attributes and use those?

Hmm. So, let's say you change an express entry. In this case, the search will only be updated when the entry is visited, not changed in the backend. That's interesting.

The other issue here is when using generic pages. So, let's say I have the following express entities:
`Country`
`Regions` (since some have states, provinces, localities, etc)
`Cities`

Country has a one to many relationship with Region. Region has a one to many with Cities. I do not want to have to create pages for each one. So I use the express detail to list all the cities, with links to a region page that has another detail that shows the region's info. That page also has another detail block to show the cities, and those link to the city page which has the detail block for cities. Each is just a simple block template to show the correct info.

In these cases, I am not sure the controller can parse out the info for search, as each entity does not have its own dedicated page.

Am I correct in this assumption? I am trying to be able to make datacentric sites, but without having to generate (or update, or delete) pages on every change. That is the quandary here.
ebirt replied:
I'm not sure if it helps or not, but we are starting to look into Google site indexing as a more robust search.
Gondwana replied:
I'm interested to know how this goes. Google can only index pages that it crawls, of course, so there'll need to be links or a sitemap that causes Google to generate all of the pages you want indexed.

I'm currently working on an 'advanced search' add-on for c5 (boolean expressions, etc), but it uses the built-in c5 search index so it won't expose pages that could potentially display express objects if they haven't been added to the index.
ebirt replied:
@Gondwana, that's great, keep us posted on your progress. Search improvement has been something our content creators have been asking for.

We are only concerned about indexing pages that show up in the generated sitemap.xml, so it sounds like it could be helpful for our needs.

Thanks!
Gondwana replied:
@ebirt Okay, here's the current status...

The actual 'advanced search' bit works. It can handle AND, OR, NOT, "literal phrase", (sub-query), and keyword relevance up/down. This took about two days to get going—not because I'm great, but because the c5 core already has the plumbing for FULLTEXT queries built in (appropriate data in tables, hooks for custom queries, etc). I was utterly impressed at that level of forethought.
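
For readers unfamiliar with FULLTEXT boolean queries, here is a rough Python sketch of how operators like these are commonly written in MySQL's boolean mode (`+` for required, `-` for excluded, `>` and `<` for relevance up and down, double quotes for phrases). Whether the add-on builds its queries this way is an assumption; `to_boolean_mode` is a hypothetical helper, not part of c5 or the add-on.

```python
def to_boolean_mode(required=(), excluded=(), optional=(), phrases=(),
                    boosted=(), demoted=()):
    """Assemble a MySQL boolean-mode search expression from term lists.

    The result is meant for the AGAINST clause of a query such as:
        MATCH (content) AGAINST (%s IN BOOLEAN MODE)
    and should be passed as a bound parameter, not concatenated into SQL.
    The column name `content` is only an example.
    """
    parts = []
    parts += ["+" + w for w in required]       # AND: term must be present
    parts += ["-" + w for w in excluded]       # NOT: term must be absent
    parts += [">" + w for w in boosted]        # raise relevance if present
    parts += ["<" + w for w in demoted]        # lower relevance if present
    parts += list(optional)                    # OR: plain terms are optional
    parts += ['"' + p + '"' for p in phrases]  # literal phrase
    return " ".join(parts)


if __name__ == "__main__":
    print(to_boolean_mode(required=["express"], excluded=["draft"],
                          phrases=["leadership team"]))
```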

I've got a small pop-up tooltip thing that gives a summary of the advanced search syntax.

The search won't support queries such as '...where attribute = "x" and description = "y"'.

What's slowing me down is getting the results to display nicely. I need to coalesce match extracts when they overlap, otherwise the result can contain partially repeated text. That looks unprofessional.
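
The coalescing step mentioned here is essentially interval merging. A minimal sketch in Python, assuming each match extract is a `(start, end)` character range into the page text; the function names are hypothetical and this is not the add-on's actual code:

```python
def coalesce_extracts(spans):
    """Merge overlapping or touching (start, end) character spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_excerpt(text, spans, sep=" ... "):
    """Join the merged extracts so no stretch of text appears twice."""
    return sep.join(text[s:e] for s, e in coalesce_extracts(spans))


if __name__ == "__main__":
    print(coalesce_extracts([(10, 30), (0, 15), (40, 50)]))
    print(build_excerpt("abcdefghij", [(0, 4), (2, 6), (8, 10)]))
```

Sorting first means each span only needs to be compared with the most recently merged one, so the pass is linear after the sort.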

My add-on will be a free drop-in replacement for the built-in search block.

I see you've tackled the other side of the problem: getting express data into the search index. I was worried that my efforts could be of limited use if that couldn't be done, so I'm glad you've overcome that.
cd13sr replied:
I also need my Express detail pages content searchable. Does anyone know if this is possible?
Gondwana replied:
This is something I want to look into (as indicated above). I can imagine that it could be very complex and might not make any sense at all in some cases, since the express objects displayed on a page may change.

I guess the way in is https://documentation.concrete5.org/developers/working-with-blocks/c...

I haven't investigated whether the default blocks that emit express object data make use of this.
Gondwana replied (1 attachment):
If anyone wants to test a crude search block that can handle some advanced search syntax, try the attached.

This block doesn’t add any content to the search index. It can only search content that has been added by other blocks. In particular, note that express object data isn’t added to the search index by the default express blocks.
MrKDilkington replied:
@Gondwana

Thank you for creating this complex search block and allowing others to test it.

On install, it throws a "ParseError syntax error, unexpected 'const' (T_CONST), expecting variable (T_VARIABLE)" error. It looks like the error is caused by visibility modifiers used on class constants when using a PHP version less than 7.1.
Gondwana replied (1 attachment):
@MrKDilkington You are, as ever, exactly right. Thanks for finding this, and for solving it too. Attached is a version that has been tested on PHP 5.6.

(There could be other artefacts, since this block was forked from the search block in c5 8.2.1. I've been working on it for that long! Unfortunately, aligning it with the latest c5 will be difficult because so much has changed.)
surefyre replied:
V8.4.2

I have a field 'Course ID' which is populated with a string. I am unable to find a course by its Course ID, even when I edit the course, copy its ID and paste it into search, or add the Course ID field to advanced search and put the value in there.

Course ID is ticked to display in entry listings, etc. Over 70 pages of objects to go through by hand now, presumably...

Have run search index automated jobs manually.