Data is generally grouped for the entire app in a single instance. Data is not combined with other apps, but neither is it split out to a single concern. My reasons for this are the minimum financial and management costs associated with provisioning a database exceed the benefits from reliability gains.
Use technologies that minimize ongoing maintenance of the application. This generally means heavy use of serverless technologies. The reasoning behind this is that maintenance typically exceeds development in terms of cost and time, so the additional costs for not managing infrastructure are justfied at small to medium scale.
Some cloud-specific technologies are used, but generally cloud-agnostic technologies are preferred. This generally means heavy use of containers.
Continuous Integration & Deployment is used in all cases. In nearly all cases, the flow is from GitHub to Azure DevOps/VSTS to the deployment target.
House the data to be indexed as well as other app data
Data Model is fairly well-defined, but there are nested properties and arrays, and no relationships involved that make an RDBMS less ideal. The data model is also very simple (1 collection). Azure Blob Storage is an extremely simple form of storage, basically just files with some special consideration for text files. In this case, it was used with Serialized JSON, creating essentially the world's simplest document database with only one index/lookup.
The data is VERY high touch, the indexing process is a very large amount of checking and re-checking. Azure Blob Storage is both very high performing (20,000 requests/second at time of writing) and very cheap ($0.023 per million requests at time of writing).
Our initial choice was actually SQL Server, but the amount of compute needed to handle the request load was high. Blob Storage was about 1/20th the cost for the same request load.
Azure Search is a search as a service built on top of ElasticSearch, which is highly regarded in the search arena. Azure adds a few nice features from SOLR over the top of it to make it a very compelling offer.
Azure Search comes at a fairly steep minimum level, but the vastly decreased development time deriving from not having to develop an indexer (it can automatically index blob storage) and not needing to code for synonyms (it has this feature) made it well worth it.
Through a combination of webhooks and polling, detect changes in the front end and hand them off to a queue
Azure Functions provides an easy to use code base for scheduled jobs (especially) and has nearly unlimited flexibility in terms of deployment and even language choice (this is like Lambdas for the AWS folks in the crowd, but can be deployed in containers).
A queue (and separate service) was used to actually index the data to combat the need for 100% uptime.
Provide the actual app that customers use to search the website, served through a proxy
It may seem odd to list ASP.NET Core as a front end technology, but in this case it was only used for its Razor frontend markup and to control some attributes of the response. Shopify has some specific requirements with how this response needs to look, and ASP.NET trivialized this task.
Since this is an actual public front end, you can use the example link above to see it.