NOAH
JACOBS
WRITING
"if you have to wait for it to roar out of you, then wait patiently."
- Charles Bukowski
Writing is one of my oldest skills; I started when I was very young, and have not stopped since.
Ages 13-16 - My first recorded journal entry was at 13 | Continued journaling, on and off.
Ages 17-18 - Started writing a bit more poetry, influenced heavily by Charles Bukowski | Shockingly, some of my rather lewd poetry was featured at a county-wide youth arts event | Self-published my first poetry book.
Age 19 - Self-published another poetry book | Self-published a short story collection with a narrative woven through it | Wrote a novel in one month; after considerable edits, it was longlisted for the DCI Novel Prize, although that’s not that big of a deal; I think that contest was discontinued.
Age 20 - Published the GameStop book I mention on the investing page | Self-published an original poetry collection that was dynamically generated based on reader preferences | Also created a collection of public domain poems with some of my friends’ and my own mixed in; I was going to publish it with the same dynamic generation, but never did.
Age 21 - Started writing letters to our hedge fund investors, see investing.
Age 22 - Started a weekly personal blog | Letters to company Investors, unpublished.
Age 23 - Coming up on one year anniversary of consecutive weekly blog publications | Letters to investors, unpublished.
You can use the table of contents to the left or click here to check out my blog posts.
Last Updated 2024.06.10
2024.12.01
LXXVI
Fair warning: this pretty much devolved into a BirdDog white paper and is a bit longer than most of my posts.
I can promise you that the length doesn’t make it boring, and it explains both what BirdDog is and why it’s an exciting business from a technical and go-to-market standpoint.
-------------------
I’ve talked about efficiency as if it were this panacea, this great big cornucopia that contains the feast that shall save humanity's soul from the encroaching darkness of waste and bad engineering. I still think it is.
Without graphics cards providing a 50-100x speed boost over CPUs when training and running LLMs, we would likely not have GPT in anything near its present form today.* Moreover, without the miniaturization of computers from room sized affairs to laptops, our economy would look far different than it does now.
Multiplying the speed of something by 50x has an asymmetrically powerful effect. You do it once, and you get the benefits forever. What took OpenAI a few years with GPUs, getting from GPT 3.5 to 4o, likely would have taken a few decades without similarly performant hardware.
Part of the reason I’ve been so excited by this notion of efficiency is what I’m starting to see with BirdDog. I don’t often use this blog as a soapbox to preach the gospel of the venture Jack and I run, and I certainly don’t intend to make a habit of it. Still, I’d like to explain why the notion of extracting structured data at scale is so exciting and what sorts of doors it opens, functionality-wise.
*Yes, the ability to run models on CPUs and in the browser is making leaps and bounds, but that seems to be a consequence of having models to optimize for CPU and WASM in the first place.
The existence of BirdDog is predicated on the assumption that information has objective, quantifiable value to sales teams. Whether it’s by way of increasing response rates, or meetings booked, or number of meetings that are “qualified,” you can draw a line from information you give a sales team to the revenue the team generates.
This assumption is substantiated by the fact that sales teams typically have some Rube Goldberg-esque system for finding out as much relevant information as they can about the companies they can sell to. Their processes typically involve some mix of a contact database tool, LinkedIn Sales Navigator, oftentimes Perplexity or GPT, and an intent data provider. Sometimes, sales teams even hire research teams in India and the Philippines to find the info they need. Below is a flow chart that reflects one of our users’ pre-BirdDog sales research process:
Caption: Yes, someone actually told me that this was his research process
So, from a user perspective, we are a sales research tool. We take the long list of companies they might sell to and answer as many questions as we can about each company. This list of questions is their “wish list”: stuff they’d try to research with these multi-step methods or figure out on the first few discovery calls with prospects.
Then, we rank these companies based on what we found so the user can talk to the most promising leads.
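To make the “answer the wish list, then rank” idea concrete, here is a minimal sketch. Everything in it is hypothetical: the Prospect class, the example wish-list questions, the weights, and the scoring rule are stand-ins for illustration, not BirdDog’s actual model or code.

```python
# Hypothetical sketch: score each prospect by the weighted share of
# "wish list" questions we managed to answer for it, then sort.

from dataclasses import dataclass, field

@dataclass
class Prospect:
    name: str
    answers: dict = field(default_factory=dict)  # question -> answer found

# The user's "wish list": question -> weight (how much it matters to them)
wish_list = {
    "Are they hiring for sales roles?": 3.0,
    "Do they cater on-site events?": 2.0,
    "Recent funding round?": 1.0,
}

def score(prospect: Prospect) -> float:
    """Weighted fraction of wish-list questions answered for this prospect."""
    total = sum(wish_list.values())
    found = sum(w for q, w in wish_list.items() if prospect.answers.get(q))
    return found / total

prospects = [
    Prospect("Acme Co", {"Are they hiring for sales roles?": "yes, 4 open reqs"}),
    Prospect("Globex", {"Are they hiring for sales roles?": "yes",
                        "Recent funding round?": "Series B, 2024"}),
]

# Most promising leads first; re-running this as new info arrives is
# what makes the ranking dynamic.
ranked = sorted(prospects, key=score, reverse=True)
```

The point of the sketch is just that once answers are structured per question, ranking and re-ranking as new information comes in is cheap.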
Why is what we have “special”?* The best way to understand that is to understand what we are trying to do, and then what our competitors appear to be doing instead.
Effectively, we are extracting structured data from noisy, unstructured data at scale.
Very specifically, the problem is locating user-defined data about some entity across hundreds or thousands of entities. Do note that this statement is stripped of any reference to sales people. It is a problem also faced by consultants doing market or competitor research, investors researching stocks (buy side & sell side), venture capitalists, private equity firms, & marketers.**
You’ll also notice that the problem statement hints at another assumption we’re making: user-defined (and therefore often non-commoditized) data is more valuable than commoditized data. Understanding either or both of 1) the high level of information nuance that sales people in a given field learn to look for, and 2) the notion of easily accessible information getting “priced in” to a system, should be more than sufficient to convince you of this assumption.
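“Structured data from noisy, unstructured data” can be shown in miniature. In the sketch below, the extraction function is a deliberately crude stand-in (a regex) for whatever a real pipeline would use, and the question, the snippet, and the function name are all invented for illustration; the shape of the problem, though, (free text in, typed answer out) is the one described above.

```python
# Hypothetical sketch: turn noisy prose about a company into one typed
# field, i.e. one answered wish-list question. A real system would use
# far more than a regex; this only illustrates the input/output shape.

import re
from typing import Optional

def extract_headcount(text: str) -> Optional[int]:
    """Pull an employee count out of free text, if one is stated."""
    m = re.search(r"(\d[\d,]*)\s+employees", text)
    return int(m.group(1).replace(",", "")) if m else None

snippet = "Acme, founded in 2009, now has 1,250 employees across 3 offices."
assert extract_headcount(snippet) == 1250
assert extract_headcount("No numbers here.") is None
```

Doing this once is trivial; the hard part, and the claim above, is doing it for arbitrary user-defined questions across thousands of entities, cheaply and repeatedly.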
How are our competitors approaching this problem? There are two dominant answers right now:
1. Find a commoditized data set that approximates what the sales person wants to know, namely intent data.
2. Use “one off” company-by-company enrichment to find the specific data the user wants.
Our solution is an attempt to provide our users with the benefits of both and the defects of neither.
*Both Jack and I are thoroughly convinced that to make us both independently wealthy, we do not need a software business that is “special.” However, having a “special” one increases the ceiling and makes it easier to grow.
**Also note that I’ve needlessly limited myself to use cases in which the entity in question is a company.
Intent data, the most common commoditized data we compete with, tells you if someone at a company was searching for a specific term.
If you are trying to sell catering services, maybe you’d have search terms you’re watching, such as “catering” or “event catering.” Then, when someone at one of your prospects searches for that, you are alerted.
This seems cool, but the degree to which it works is very limited. If the prospect searches your company’s name, for instance, it can be an indicator that they’re interested. However, it doesn’t actually tell you who at the company did that search. At a large firm, it is still like finding a needle in a haystack. This issue is compounded in the likely event that the firm searches for your product category (“donuts”) rather than your specific firm or product (“Krispy Kreme”).
Still, the benefit of this sort of commoditized data is that you can stack rank your prospect list and create triggers. Of your 500 accounts, intent data can give you a reason to focus on some of them more than others, and it promises to let you know immediately when there’s “good timing.”
Given its low hit rates, intent data is used with an “it’s better than nothing” attitude. I’ve met two salespeople who said it was good, and in their very specific use cases, it sounds like it is. As for the rest of the hundreds of sales people I’ve spoken with, none are impressed.
It’s sort of like you’re looking for people whose house is or was recently on fire based on who has gone to the fire extinguisher store. Isn’t it better to just know whose house is actually on fire?
An increasingly common feature of sales tools is to use an agent to get specific data about a company. Typically, that just means breaking a user’s query into a Google search, then having an LLM read the results, check out the sources, and come back with an answer.
In effect, this successfully approximates what a user would learn via a Google search and a few minutes of research. Convenient, but not earth-shattering.
To systematically evaluate info from many sources at the same time, you’d need pretty complex infrastructure or tooling; at that point, the fact that it was an agent would likely be a footnote.
Additionally, these tools function as a sort of “point in time” reading. Meaning, you can expend the energy to get the answer once, but how you will turn a natural language response, or even a binary answer, into a nuanced alert is not obvious. If you keep asking the same question over the first page of Google, you might as well just use the Gemini answer at the top of the search results.
Simply re-running the same agent workflow whenever there is “new” news doesn’t seem to be much better, either. Imagine the user wants to know specific details about any potential merger. Given lots of redundant news pieces, you’d need some way to really sift through them and make sure you didn’t keep bothering users about the same facts that have been reported for the last month. If you spam, your system becomes noisy and less relevant.
That aside, while it’s somewhat of an abstract measurement, our bet is that the “cost per information”* of regularly running such low-depth queries consistently across your prospect list would be quite poor compared to what we’re doing.
*Descriptive, I know. What I will say is that in the absence of non-LLM-based information evaluation tools, the cost for an agent to increase the number of sources would be linear, whereas there are “foreseeable” ways to keep that cost per source logarithmic or even constant by focusing engineering time on tools or pipelines rather than agents.
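The linear-versus-logarithmic contrast in that footnote can be put in toy numbers. Every constant below is invented purely for illustration (they are not BirdDog’s or anyone’s real costs), and the “pipeline” function simply assumes that cheap non-LLM pre-filtering keeps expensive reads roughly logarithmic in the number of sources, which is the footnote’s hypothesis, not a demonstrated fact.

```python
# Toy comparison: how total cost per question might grow with the number
# of sources evaluated. All dollar constants are made up for illustration.

import math

def agent_cost(n_sources: int, cost_per_source: float = 0.002) -> float:
    # A plain agent reads every source with an LLM call: cost grows linearly.
    return n_sources * cost_per_source

def pipeline_cost(n_sources: int, base: float = 0.002,
                  per_level: float = 0.002) -> float:
    # Assumed: cheap non-LLM tooling (indexes, classifiers) narrows the
    # candidate set, so expensive reads grow roughly logarithmically.
    return base + per_level * math.log2(max(n_sources, 1))

for n in (10, 100, 1000):
    print(n, round(agent_cost(n), 4), round(pipeline_cost(n), 4))
```

At 1,000 sources the gap between the two curves is two orders of magnitude, which is the whole argument for spending engineering time on tooling rather than on the agent loop itself.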
Once again, what is BirdDog doing technically and why are we so obsessed with doing it efficiently?
We want to make it super easy for users to find and monitor the data they specifically care about, whatever it is. House fires, not fire extinguisher shopping.
Due to the expense and ineffectiveness of finding the data consistently and then monitoring it either manually or programmatically, sales people cannot get the benefits of prioritizing or monitoring their prospect list. Intent data might be “better than nothing” and agents are “cool,” but in the absence of more directly relevant data and more sophisticated analysis tooling, respectively, both are leaving significant value on the table.
Instead, we are myopically focused on extracting the data the user specifically wants at scale in a way that dynamically ranks their prospect list as new info comes in.
Let’s make this real with some numbers that aren’t very far removed from reality.
A firm paying a virtual assistant (VA)* can maybe expect research to be done on 5 companies per hour. At $3/hr, that’s $0.60 per account, not counting any tools they’re paying for on behalf of the VAs, or the cost of managing them and validating their data, which can be quite high. But we’ll be generous and ignore these.
Let’s say it costs us $0.01 to answer one question for one prospect company as well as or better than such a VA. That seems small, but for 20 questions, that’s still $0.20 per account. A good improvement over the human cost, but not insane.
Now, if you could get the cost of answering a question down to $0.005, and the cost of checking for and incorporating a change in the answer down to $0.001, you could be one third the cost of the VA AND check for updates once a week for an entire month. That’s without mentioning that adding more companies or questions to a VA’s list increases the cost of management. Maybe logarithmically, but it’s still an increase.
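The arithmetic above works out; here it is spelled out, using only the illustrative figures from the text (none of these are actual BirdDog costs):

```python
# Working through the example numbers from the text.

questions_per_account = 20

# Virtual assistant baseline: $3/hr at 5 companies per hour.
va_cost_per_account = 3.00 / 5            # $0.60

# First automated pass: $0.01 per question answered.
v1_cost = 0.01 * questions_per_account    # $0.20 per account

# Improved pass: $0.005 per initial answer, plus $0.001 per question
# per weekly update check, over a four-week month.
v2_cost = (0.005 * questions_per_account
           + 0.001 * questions_per_account * 4)  # $0.10 + $0.08 = $0.18

# $0.18 is under a third of the $0.60 VA cost, updates included.
assert v2_cost <= va_cost_per_account / 3 + 1e-9
```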
By these standards alone, this would be a vast improvement over the status quo. You can inexpensively and accurately get the info you want and monitor it. With cost improvements equal in magnitude to the differences in the example still on the table for us, and the functionality I mentioned either good to go or in progress, I seriously believe BirdDog will soon feel like magic and give users functionality that is not available elsewhere.
*The fact that I’m using VAs here, and not the cost of a sales person doing the research, to get to the “raw cost” of processing the information should reveal that I’m not pitching you, the reader. If we were talking in terms of the value that we “sell” to sales teams, it is not time. Indirectly, it is revenue: increasing the number of SQLs they can expect to get, with all other inputs, such as time spent prospecting, held constant. We want a buyer thinking about cost per revenue more than about time.
-------------------
There you have it: BirdDog.
I do hope that answers more questions than it creates, but feel free to ask me about anything that was not made clear.
Live Deeply,