Everyone in the startup world has heard of the PayPal mafia, the founders and early employees of PayPal and associated companies. The dons of this mafia include Elon Musk, Peter Thiel, Reid Hoffman, and others. They have collectively created companies valued at over $2 trillion, including Tesla, SpaceX, LinkedIn, Palantir, YouTube, and Yelp.
More recently, Palantir itself has gained quite a reputation as a founder factory, spawning over 100 companies that have raised over $11B. They include household names like Affirm and Partiful, and data startups like Hex, which we talked about a couple of weeks ago.
Speaking of the data world, do we have our own version of a ‘mafia’? Perhaps…
Back in 2008, Jake Stein and Bob Moore founded a little analytics company called RJMetrics in Bob’s attic in Philadelphia.
As an aside, we like to think of Silicon Valley as the center of innovation in America, but Philly’s innovation roots run deep: Benjamin Franklin was tinkering with kites and keys and inventing bifocals, swim fins, and modern democracy in the City of Brotherly Love in the 1700s. I digress.
Bob and Jake set out to "inspire and empower data-driven people" by integrating data spread across a wide variety of disparate systems (sadly still a challenge). This was well before cloud data warehouses. In fact, AWS itself had only existed for 2 years at that point, GCP launched that year, and Azure would not become generally available until 2010.
Bob and Jake had to build a full-stack analytics platform, from data ingestion all the way to dashboards, as a SaaS tool. I’m not an expert in the landscape at the time, but it was pretty revolutionary. The alternatives were primarily on-prem enterprise software, which was a non-starter for smaller businesses.
RJMetrics was a pioneer, a proto-modern data stack, and it would come to have a big impact on the market. Bob and Jake bootstrapped for several years and built a small team, including Chris Merrick, who joined as an engineer and would later be promoted to VP of engineering.
In 2012, they raised external capital, a $1.2M seed round. Then, they quickly raised a $6.2M A round in 2013 and a $16M B round in 2014. With this funding, the product advanced into modules we recognize today: an Analytical Warehouse, Transformation Cluster, and Visualization Interface—all in one platform.
It was also during this period that they hired Tristan Handy, a name you might recognize from past posts, to lead marketing.
Unfortunately for RJMetrics, another trend was emerging at the same time. Redshift launched in 2013 and Snowflake in 2014, and it became more attractive to run your data workloads on your own dedicated data warehouse. Moore cited rising customer acquisition costs as the reason for a 25-person layoff in early 2016, which was announced alongside an increased focus on the “Pipeline” product (foreshadowing things to come).
In parallel with those rocky times at RJMetrics in early 2016, Tristan Handy began bootstrapping Fishtown Analytics, a data consultancy, alongside other RJMetrics folks: Connor McArthur and Drew Banin (and occasionally Yev Meyer).
The RJ/Fishtown crew also started working on dbt, short for data build tool, an open-source project for data transformation on cloud warehouses. Chris Merrick, then VP of engineering at RJ, wrote the first lines of code, to Drew Banin’s joking frustration.
The combination of struggles with the core RJMetrics business and the success of the “Pipeline” product led to a few key events that summer. Magento (now Adobe Commerce) acquired the RJMetrics BI/Analytics product and business, which was rebranded as Magento Analytics. Bob Moore would join Magento to lead that product.
A key part of the Magento deal was that the Pipeline product would be spun out as a new standalone company called Stitch. Jake Stein would lead Stitch as CEO and Chris Merrick would join as CTO while Bob remained involved in a chairman role.
We now had all the core elements of the early data stack available as independent solutions: Stitch for ETL (competing with Fivetran), cloud data warehouses in Redshift, Snowflake, and BigQuery, dbt for transforming data in the warehouse, and standalone cloud BI solutions like Looker (founded in 2012).
The next act in this play was Stitch’s release of Singer, an open-source project for ETL, in 2017. They argued, as Airbyte later would, that open source is the best way to manage support for the huge number of connectors required to support all the SaaS apps in the market. Singer offered a standardized framework that a developer could build on to connect to any arbitrary API, database, etc.
Stitch would continue to support Singer as a managed service.
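To make the Singer idea a bit more concrete, here is a minimal sketch of what a "tap" does under the Singer spec as I understand it: it writes JSON messages to standard output, one per line, using SCHEMA, RECORD, and STATE message types. The stream name, schema, sample rows, and the helper names `tap_messages` and `emit` are all hypothetical, made up for illustration. This uses only the Python standard library and is not Stitch's or Singer's actual code.

```python
import json
import sys
from typing import Any, Dict, Iterable, Iterator

# Hypothetical stream name and JSON Schema, used only for illustration.
STREAM = "users"
SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
    },
}


def tap_messages(rows: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield Singer-style messages for one stream: SCHEMA, RECORDs, then STATE."""
    # SCHEMA describes the shape of the records that follow.
    yield {
        "type": "SCHEMA",
        "stream": STREAM,
        "schema": SCHEMA,
        "key_properties": ["id"],
    }
    last_id = None
    for row in rows:
        # Each source row becomes one RECORD message.
        yield {"type": "RECORD", "stream": STREAM, "record": row}
        last_id = row["id"]
    # STATE is a bookmark so a later run can resume where this one stopped.
    # The layout of "value" is an assumption; the tap chooses its own format.
    yield {"type": "STATE", "value": {"bookmarks": {STREAM: {"last_id": last_id}}}}


def emit(messages: Iterable[Dict[str, Any]], out=sys.stdout) -> None:
    """Write each message as one line of JSON, which is what a target reads."""
    for message in messages:
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    # Stand-in for rows pulled from some API or database.
    sample_rows = [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": "b@example.com"},
    ]
    emit(tap_messages(sample_rows))
```

A "target" is the other half: a separate program that reads these lines from standard input and loads them into a warehouse. Because the two sides only share a line-oriented JSON format, any tap can in principle be piped into any target, which is what makes the community-maintained connector model workable.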
Around this time, the cloud data warehouses really started cooking, and ETL tools were hot. The sales teams at Snowflake, AWS/Redshift, and BigQuery were eager to send business to Fivetran, Stitch, and Matillion (the major players in those early days) because a warehouse wasn’t going to add much value unless you could get your data in. See Google Trends for those three players below. Pre-2018, they were all neck and neck. This analysis likely underrates Stitch, as I had to add the qualifier “data” to the keyword to avoid other uses of its name. Side note: it would be interesting to figure out what allowed Fivetran to jump ahead of Matillion in awareness in 2022, but we’ll leave that for another day.
In addition to the partnership-driven growth, Stitch also did a great job with search engine optimization (SEO). ETL in particular lends itself to SEO because the specific problems that data teams have are very easily articulated with keywords, e.g. "connect Salesforce to Redshift". Stitch did a good job of creating this content and ranked well. See metrics from SEMRush below. Stitch was absolutely crushing the competition in search until quite recently.
From what I can tell, Stitch was doing pretty well at this time. They caught the attention of Talend, a public company and the leading on-prem open-source ETL vendor at the time. In November 2018, Talend announced the acquisition of Stitch for $60M cash. That doesn’t sound like a bad exit, but I do wonder if there was more to the story than simply a good exit. It might be that a complicated cap table, due to Stitch having been a spin-out from RJMetrics, made it difficult to raise money. At that point, Matillion had recently raised a $20M Series B. Fivetran was still pretty early and would announce a $15M Series A a month after the acquisition. So, whether it was just a good exit, cap table/funding challenges, competitive funding pressures, or some combination of the above, the Stitch team decided to sell.
For the first couple of years inside Talend, Stitch continued to grow. When I met with data teams in 2020/2021, my sense was that Stitch had a greater market share than Fivetran, at least in terms of customer count.
That said, even then, the product and support had started to fall off. From the outside, Stitch didn’t seem to be a priority for Talend.
In 2021, Talend was acquired by PE firm Thoma Bravo, which really spelled the end for Stitch. The founding team had left by early 2022. In 2023, Qlik, another Thoma Bravo-owned company, acquired Talend, and it was really over for Stitch.
It seems that Stitch could have been Fivetran, but perhaps it was not to be. Regardless, the end was a new beginning, leading to the further expansion of the mafia.
Jake Stein left Talend in 2020 and started Common Paper, a startup focused on streamlining the contracting process, something I actually considered pursuing myself earlier in my career.
Chris Merrick left Talend in 2022 to start Omni with co-founders Colin Zima and Jamie Davidson, who were both early members of the Looker team. Google acquired Looker in 2019 and, in a somewhat similar process to what happened to Stitch at Talend, allowed a well-loved product to lose at least some of its luster. So Chris, Colin, and Jamie set out to create a better Looker and pick up the slack. Every indication from the outside is that it’s going well. They have raised a total of $97M, most recently in a $69M Series B, and are above $10M in ARR.
So that’s the Stitch side of the story. Almost…
Back to Singer, the open-source project. It lives on. With Stitch disappearing into Talend/Qlik, the data team at GitLab decided in 2021 to spin out Meltano, a project built around Singer, as a standalone company.
Around the same time, Airbyte came out of the YC W20 cohort. They sold a compelling vision of open source as the solution to data integration (sound familiar?), and my understanding is that they borrowed heavily from Singer, at the very least the connectors/taps, and perhaps more. Airbyte would quickly raise a $25M Series A in May 2021 and, in what was perhaps the best example of Covid/ZIRP-era VC/startup excess, raise a $150M Series B at a $1.5B valuation. At the time, Airbyte was reported to be making less than $1M in revenue, so the round top-ticked the market at a revenue multiple of at least 1,500x.
Alas, Airbyte has yet to live up to that in the way that other modern data stack companies have. And with that, we can close out the Stitch chapter.
dbt is the big breakout star of the RJMetrics mafia. Tristan, Drew, and Connor continued on as Fishtown Analytics for the first few years, but dbt was gaining steam. In 2020, it took off.
What had been a consultancy was increasingly a tech company focused on managing an incredibly popular open-source project. Cut to 2021, and the Fishtown Analytics name was retired in favor of dbt Labs. dbt Labs has raised a total of $416M to date and recently announced that it passed $100M in revenue.
Those are the heavy hitters of companies founded by RJMetrics employees or derived from associated projects.
It’s a bit outside the data world proper, but Bob Moore left Magento in 2018 and founded Crossbeam, a partnership platform. Data is certainly the underpinning of the platform, as it’s effectively a data clean room for B2B partnerships, but it’s not in the modern data stack, other than facilitating partnerships between data vendors. We’re users of the platform at Streamkap.
Some other former RJMetrics employees who have started companies include Michael Drogalis of Shadowtraffic, a Kafka traffic simulation tool, and Andres Recalde of Muffin Data, a CPG analytics platform.
If we total it up, RJMetrics alumni founders have raised over $800M!
Beyond founded companies, former RJMetrics folks are everywhere you look in data, at Rudderstack, Dagster, Snap, Google, Motherduck, Datadog, and others.
That’s my outside look at the RJMetrics story. Let me know if others are in contention. I have my eye on it.
Warmly,
Paul Dudley