For over a year I have been leading the Holdingbay team on the Verified Data project. We have had the opportunity to help the respected team at Search Integration, and especially Brian Clifton, bring his product idea to market: turning one of Search Integration's consulting services, grounded in his respected 2015 book Successful Analytics, which covers best practice in ten sections, into a product.
There have been many challenges and learning steps along the way, as with any startup, and especially for one that offers users the chance to turn a quarter-long process into a daily one. That shift in mindset can open doors, but it often takes work at both the product and sales level to demonstrate the value before the application is adopted.
At scale, a single analytics owner needs to trust the data
Large-scale websites for companies that lead their sector in one or several countries can be run by multiple teams and departments, or even other companies. So if a single head of Marketing or Analytics needs to keep track of the data and know they can trust it, that can be a taxing job, especially when changes can be made by people who do not report directly to them or who work for contracted agencies.
They want to trust the end-of-month data used in senior meetings and planning decisions, and the risk of error can be reduced by checking that the foundations are correct.
If the person responsible can point to an external service that checks and approves work and changes on a rolling basis, then delivery commitments are easier to sign off.
Saving analysts time and giving managers increased reliability
The task for the application is to speed up an analyst's work of checking hundreds of pages one by one in a web browser, then reviewing the Google Analytics configuration for each Account, Property, and Profile view in the company's set-up. It can become a flood of numbers and small details on the scale of financial bookkeeping.
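To give a sense of why this review balloons, here is a small illustrative sketch (the account names and counts are made up, not real client data): a company's Google Analytics set-up is a tree of Accounts, Properties, and Views, and every leaf needs its settings checked.

```python
# Hypothetical example of a company's Accounts -> Properties -> Views tree.
setup = {
    "Account EU": {"Main site": ["All traffic", "Filtered"], "Blog": ["All traffic"]},
    "Account US": {"Main site": ["All traffic", "Filtered", "Test"]},
}

# Flatten the tree: one check per view configuration.
checks = [
    (account, prop, view)
    for account, props in setup.items()
    for prop, views in props.items()
    for view in views
]
print(len(checks))  # 6 view configurations to check, even for this tiny example
```

A real enterprise set-up multiplies each of these levels, which is where the bookkeeping-scale flood of detail comes from.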
The people who do this are very skilled at looking at the resulting data and finding issues and fixes, but their time is better spent on that than on collecting all the information. People are also human: they get tired, and error rates climb when they do massively repetitive work.
Finding a solution
This is where the tool comes in: impartial, collecting every page in the same way, just as a user would, but fast and efficient, and giving the analysts the facts to work with at the end.
Choosing to audit another service and provide a third-party reference for it can be a large undertaking. Normally when a job like this is set up to test one website, it will be tuned to that use case. As different websites are made by different teams with different technologies, changes of direction and implementation are always possible.
The task of this project was not to audit the website itself, though, but only how it used and configured Google Analytics, which did reduce some complexity.
Building the technology for a web application
Starting with a Minimal Viable Product (MVP), the team worked for eight weeks to prove the concept. We chose the open-source Laravel framework as a good fit for building a SaaS application. We have a strong Laravel team, and we knew that if we needed to scale up production at any point in the build we could get new people up to speed at reduced cost, as there are many local freelance professional developers who work with the framework.
We kept the design process to a minimum in the MVP phase to keep focus on building out the process and the UX flow of users through the app. The goal was an end-to-end test: a user logs in and starts an audit, the audit runs, and a report appears with the first couple of results.
While doing this and talking to the Google Analytics APIs, we could map out the structure, start to store data, and learn about the process and where the complexity was.
With that completed, we scaled the development team up to four people over the year that followed, with the features for each section broken up into agile sprints. We could break changes down into small, manageable parts, which made them easier to review and meant we avoided a long design phase where ideas could only be looked at, not tested.
Building a service application that different companies can log into, each safely seeing only their own account, follows a design pattern called multi-tenancy. We chose to approach this with a single database, tying all account data tables to a tenant key, so that no database query could load data for display without being locked to a single account. This allowed the project to scale with normal relational database patterns without putting much more load on data requests.
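The single-database approach can be sketched in a few lines. This is an illustrative example, not the Verified Data code (the table and column names are invented): every tenant-owned table carries a tenant key, and every read goes through a helper that always applies the tenant filter, so one query can never return another account's rows.

```python
import sqlite3

# In-memory stand-in for the shared relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audits (id INTEGER, tenant_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO audits VALUES (?, ?, ?)",
    [(1, 100, "audit for tenant A"), (2, 200, "audit for tenant B")],
)

def audits_for_tenant(conn, tenant_id):
    # The tenant key is always part of the WHERE clause, never optional.
    return conn.execute(
        "SELECT id, name FROM audits WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()

print(audits_for_tenant(conn, 100))  # tenant A sees only its own audit
```

In Laravel this filtering is typically centralised (for example with a query scope applied to every model) rather than repeated by hand in each query, which is what makes the isolation hard to bypass by accident.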
What multi-tenancy changes, compared with a single-account CMS such as a WordPress or Joomla site for one company, is that there is still a single layout for a report, but only the users in that tenant account can see the data in it. So if my company created audit 1 and you logged in to your company's account, you would not see the audit 1 that I made. Much like email: you do not see my emails, but we can both use the same application, such as Gmail.
The problem this creates is in testing. If different people in the business process have their own accounts as separate tenants, we could not check each other's results. We worked around this by having one set of logins that were all on the same tenant, alongside our own accounts for our own tests. For every change on the task list we always added a screengrab of the area referenced and the account we logged in with, so that whoever fixed the issue would know how to find it, or could ask the requester for a demo.
Queues
To process the large number of small jobs needed to check hundreds of API calls and thousands of web pages, the tasks were broken up. In small batches of 1-5 requests at a time, they could all be placed on a job queue and processed one by one, with more workers pulling the next job from the queue whenever we wanted more horizontal scale. This is built into Laravel, so we did not need to code our own system, just the jobs. For the scale we were working at and aiming for, we chose RabbitMQ. It can process thousands of job requests a minute, which made it a great choice for scaling up and adding more queues for different types of job.
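The worker-queue idea can be shown with a minimal sketch. In production the broker is RabbitMQ driven through Laravel's queue system; here an in-memory queue stands in for it, and the URLs and batch contents are made up. Each job is a small batch of page checks, and horizontal scale is simply more workers pulling from the same queue.

```python
import queue
import threading

jobs = queue.Queue()          # stand-in for the RabbitMQ broker
results = []
results_lock = threading.Lock()

def worker():
    while True:
        batch = jobs.get()
        if batch is None:      # sentinel: no more work for this worker
            jobs.task_done()
            return
        checked = [f"checked {url}" for url in batch]  # stand-in for the real audit work
        with results_lock:
            results.extend(checked)
        jobs.task_done()

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()

# Enqueue two batches of five pages each.
for i in range(0, 10, 5):
    jobs.put([f"https://example.com/page{n}" for n in range(i, i + 5)])
for _ in workers:              # one sentinel per worker to shut down cleanly
    jobs.put(None)
jobs.join()

print(len(results))  # 10 pages checked across 4 workers
```

Adding throughput then means starting more worker threads (or, in the real system, more worker processes on more servers) without touching the jobs themselves.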
At one point we even got too busy for our own queues, with four workers on each queue. To fit our SLA with client websites we could not go over X requests a minute, and Google and other APIs had requests-per-minute caps. These numbers are fine when working with one website, but if 10 or 100 audits run at once, you reduce the available calls by dividing them between all of the audits. We did not want to go faster, even when faster would be nice, as it could affect the performance of the website being tested, and so its customers.
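The arithmetic behind that constraint is simple but worth making explicit. This sketch uses illustrative numbers, not the real caps or SLA figures: a fixed requests-per-minute budget has to be divided between every audit running at once, so each audit's effective speed drops as concurrency grows.

```python
def per_audit_budget(requests_per_minute, running_audits):
    """Integer share of a global per-minute request cap for each concurrent audit."""
    if running_audits == 0:
        return requests_per_minute
    return requests_per_minute // running_audits

print(per_audit_budget(600, 1))    # one audit gets the full cap: 600/minute
print(per_audit_budget(600, 10))   # ten audits: 60 requests/minute each
print(per_audit_budget(600, 100))  # a hundred audits: only 6 each
```

This is why adding workers alone does not help once the cap is the bottleneck: the limit is on outbound requests, not on processing capacity.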
A system like this is the inverse of a traditional marketing website: it serves just a few users on the frontend and simulates thousands on the backend to test their website. In the case study on Verified Data I talk about the Token Ring solution, where the result was scale achieved by changing only the protocol, not the architecture.
Microservices
This application was not designed as microservices, but we did take those practices on board to loosely couple the product at the service level. Making each service responsible for itself by design made it easier for different developers to work on areas that did not overlap, and so to move faster. The smaller pieces could then repeat patterns, speed up learning, and were simpler to review.
Early on we broke the worker queues out of the product to run on their own, as they could live on another server. We noticed how much extra work it put on the team to keep the two separate applications in sync without a real need, so we merged them again. Microservices tend to work better when separate teams, rather than single developers, can be responsible for each service. The pattern was still good to learn from.
Logging
The area that became the most important in the Laravel application was logging. When you have lots of small jobs, with current state and changes happening in the background away from the visual interface, you need to be able to track and follow them. Watching the tail of the log output showed the multiple smaller parts rolling out results a little like a printer.
When we moved the stack of application, database, and other technologies to multiple servers with Docker, logging to a file on each server became a problem. With each instance of the application saving logs locally and more people using the system, it was hard to follow changes and progress. So we brought the logs back together with the logging service Loggly, keeping to our project principle of not spending time developing systems that were not core to the business case until scale meant we could no longer buy what we needed. We could have built our own ELK stack, but we preferred partner SaaS platforms based on open-source software, making it easier to take ownership in the future if we wanted. Had we built our own, we would have needed to spend time setting it up and maintaining it, rather than using the partner's support and documentation to move forward fast and see results first.
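Centralising logs only pays off if every record carries enough context to be filtered afterwards. This sketch shows the idea in Python's standard logging module; the in-memory handler stands in for the real transport to Loggly (normally HTTPS or syslog), and the tenant and job names are invented. Tagging each record with its job and tenant makes one merged stream from many servers readable.

```python
import logging

class CollectingHandler(logging.Handler):
    """Stand-in for a remote log shipper: keeps formatted records in memory."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(self.format(record))

logger = logging.getLogger("audit.worker")
logger.setLevel(logging.INFO)
handler = CollectingHandler()
handler.setFormatter(
    logging.Formatter("%(name)s tenant=%(tenant)s job=%(job)s %(message)s")
)
logger.addHandler(handler)

# Each worker includes its context, so the merged tail reads like one story.
logger.info("page checked", extra={"tenant": 100, "job": "crawl-42"})
logger.info("GA call done", extra={"tenant": 100, "job": "api-7"})

print(handler.records[0])  # audit.worker tenant=100 job=crawl-42 page checked
```

With that context in place, following one audit's progress is a filter on the central service rather than an ssh session per server.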
Moving forward
Building a SaaS application is a longer process than you first think, and as an application that becomes the core of the business it will never be finished. There are always new requests from users and performance improvements to make. We also had to keep space in the schedule to keep up with changes in the Google Analytics API, which changed twice during the project and released a new version that was still in testing.
Having good clients on board to test the application and see the value of the resulting reports produced some of the best results. If you do not have real users testing the product and rely too long on the team's idea of what is needed and working, you can find surprises or take too long to get to market. Keep the releases small and listen to the feedback, so the application serves its users as well as it can. Keep learning and providing value, and the business can grow.
The technologies and processes we chose for this project proved to work as needed. The parts we adapted and changed came down more to how we gave users feedback in the interface and what controls they needed than to any technology choice. Seeing a problem in a working system was better than a concept of it. With SaaS, as with any new tool or car, you need performance data from seeing it work, not just from looking at the drawings and guessing. Logging performance often showed areas worth examining, but sometimes the solution was not clear and needed a new approach rather than a fix.
Thanks to the hard work of the team and the partners who got the product this far to market, and now on to the roadmap for the future.
Do you have a process that could benefit from scaling as a custom application?
Send me your thoughts; we can advise you on a roadmap to production.