Thank you for these great questions!
I think the most impactful initiative is mass data collection, because it is the one intervention that enables every other intervention in the AI space for animal advocacy. By focusing our first 6 months on extensive data collection and curation, we will lay the groundwork for all 3 of our planned interventions: training and deploying open-source AI systems free from speciesist bias, empowering animal-friendly organisations to integrate AI into their operations, and helping AI labs and developers align their models with the interests of animals.
Because all 3 interventions require this as a first step, we save considerable time and resources by choosing to start here.
Furthermore, by open-sourcing this dataset, we further mitigate potential risks by allowing any individual or organisation to develop their own interventions using this data. This significantly increases the potential for impact: this first step lays the groundwork not only for our own interventions, but potentially for all future interventions at the intersection of AI and animal advocacy.
But in terms of ranking the three interventions that come after data collection, it primarily depends on the timeframe over which you evaluate impact, whether you prefer lower-risk, lower-reward interventions or higher-risk, higher-reward ones, and whether these interventions happen together or in isolation.
For example, helping animal organisations implement AI in their workflows is much more likely to be successful if we're helping them implement AI without speciesism, and helping AI labs reduce speciesism in their models will be much easier if we have already successfully done so with our own models.
Helping animal organisations with implementation is likely to have the greatest short-term impact, whilst working with other AI labs may have the largest long-term impact, but also carries a higher degree of risk.
As for the question about specific use cases, I'm personally most excited about automated agents, chatbots, and the intersection of generative and predictive AI. For example, we could have AI agents that monitor social media for misinformation from the animal industry and automatically respond with factual information. We could have LLMs that use real-world social media analytics from animal organisations as their reward function: they would learn which kinds of posts get the most reach, likes and comments, and write more posts like that. And we could have AI-powered chatbots that personalise their responses to each individual based on what is most likely to resonate with them.
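To make the social media monitoring idea a little more concrete, here is a minimal sketch, using a generic zero-shot classifier and a small stand-in generator rather than our eventual fine-tuned models; `fetch_recent_posts` is a hypothetical helper standing in for a real social media API, and the threshold and model names are purely illustrative.

```python
from transformers import pipeline

# Placeholder models: in practice these would be our fine-tuned, animal-aligned models.
misinfo_detector = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
responder = pipeline("text-generation", model="gpt2")

def fetch_recent_posts():
    # Hypothetical helper: a real agent would pull posts from a social media API.
    return ["Certified farms guarantee that animals never suffer."]

for post in fetch_recent_posts():
    result = misinfo_detector(post, candidate_labels=["animal industry misinformation", "other"])
    if result["labels"][0] == "animal industry misinformation" and result["scores"][0] > 0.8:
        prompt = f"Write a brief, factual, sourced correction to this claim:\n{post}\nCorrection:"
        draft = responder(prompt, max_new_tokens=150)[0]["generated_text"]
        print(draft)  # drafts would go to a human reviewer before anything is posted
```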
Regarding the technical questions about the expected increase in performance and the difference between prompt engineering and fine-tuning, I'm currently in the middle of writing a literature review that addresses this in more detail, and I'd be more than happy to share it with you once it's done to provide a more comprehensive answer. In the meantime, here are a few general thoughts that explain why I believe the 10% increase in productivity estimate is highly conservative.
Prompt engineering techniques can work well to align LLMs for narrow use cases like creating vegan recipes, but in agent-like systems (specifically externally facing ones that deal with the general public) the risk of the system becoming unaligned rises, as does the potential impact of that risk. I believe this is going to be an increasingly important problem to solve as the year progresses, because the AI industry as a whole is moving towards automated agents quite rapidly. There are also data and privacy concerns with closed-source models for animal organisations, many of which see this as an obstacle to implementing AI; open-source, locally hosted models could potentially solve this issue.
Also, optimising for the correct reward function will be very difficult to do with closed-source models. For example, with enough data, we can train models that predict how different advocates rank different responses based on the type of advocacy they do, and models that predict how social media or blog posts will perform for different organisations. We can then use these as reward models to fine-tune LLMs that are goal-focused towards the needs of animal advocates. I believe that as a result we will be able to create much more persuasive LLMs than we could through prompt engineering alone.
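As a rough illustration of what I mean by using advocate preferences as a reward signal, here is a minimal sketch of reward-model training in the style used for RLHF; the base model, the example pair and the bare-bones training loop are simplified assumptions rather than our actual pipeline.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
reward_model = AutoModelForSequenceClassification.from_pretrained(
    "distilroberta-base", num_labels=1  # single scalar reward per response
)
optimizer = torch.optim.AdamW(reward_model.parameters(), lr=1e-5)

# Each record pairs a response advocates preferred with one they rejected (invented example).
pairs = [
    {"chosen": "Plant-based menus reduce both costs and animal suffering ...",
     "rejected": "Animals don't really mind confinement ..."},
]

for pair in pairs:
    chosen = tokenizer(pair["chosen"], return_tensors="pt", truncation=True)
    rejected = tokenizer(pair["rejected"], return_tensors="pt", truncation=True)
    r_chosen = reward_model(**chosen).logits
    r_rejected = reward_model(**rejected).logits
    # Standard Bradley-Terry style preference loss used in RLHF reward modelling.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```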
There are also a lot of interesting use cases for smaller fine-tuned LLMs as tools. For example, we could create hyper-specialised small models for something like grant-writing tailored to vegan grant-makers (using data on which grants do and don't get approved as the reward) and then have a larger LLM decide when to call that tool. The impact of something like a vegan-specific GPT would be much greater if it had access to a wide range of small fine-tuned models, prediction models and retrieval-augmented generation, even if we don't succeed in creating a superior general vegan LLM (although I am very confident that we will be able to create that as well).
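A simple sketch of that routing idea, where a general model hands grant-related requests to a small specialist: the model name `openpaws/grant-writer-small` is hypothetical, and the keyword check merely stands in for the larger LLM's own tool-calling decision.

```python
from transformers import pipeline

TOOLS = {
    # Hypothetical fine-tuned specialist; any causal LM checkpoint would slot in here.
    "grant_writing": pipeline("text-generation", model="openpaws/grant-writer-small"),
}
general_model = pipeline("text-generation", model="gpt2")  # placeholder general model

def answer(request: str) -> str:
    # In a real system the general LLM would decide (via function calling) whether
    # a specialist tool should handle the request; a keyword check stands in here.
    if "grant" in request.lower():
        return TOOLS["grant_writing"](request, max_new_tokens=300)[0]["generated_text"]
    return general_model(request, max_new_tokens=300)[0]["generated_text"]
```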
Regarding the question about influencing AI labs, the animal advocacy movement has a long history of successful corporate and legislative campaigns that we can learn from and apply in this space. One thing that stands out to me is that the most successful corporate advocacy campaigns tend to offer a clear alternative, and often support in helping companies implement that alternative. I believe that for us to be successful in influencing AI labs, we will likewise need a clear alternative, in this case by building animal-aligned AI models, evaluations and benchmarks that we can use to help AI labs align their models.
No problem, and I likewise apologise for taking so long to get this response back to you!
I certainly agree that hallucinations are a huge limitation for using current LLMs in chatbots or automated actions of any kind. Hallucinations are far more likely to occur on questions outside of an LLM's training data, so training an LLM specifically on data relevant to animal advocacy should reduce their frequency.
In addition to using it as training data, the database we build will be used for retrieval-augmented generation to ground responses in fact and provide citations for sources.
These two approaches, combined with training techniques designed to reduce hallucinations (such as converting graphs showing relationships between objects into text for training data, and using data augmentation to increase diversity in the dataset), will make the LLM we train far more reliable and less likely to hallucinate on animal rights issues.
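To illustrate how the database would ground responses, here is a minimal retrieval-augmented generation sketch using TF-IDF retrieval as a stand-in for the real vector database; the documents and source IDs are invented examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented example entries standing in for the curated animal advocacy database.
documents = [
    {"text": "Chickens can recognise over 100 individual faces.", "source": "doc_001"},
    {"text": "Pigs perform comparably to dogs on many cognition tasks.", "source": "doc_002"},
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform([d["text"] for d in documents])

def build_grounded_prompt(question: str, k: int = 2) -> str:
    # Retrieve the k most relevant passages and attach their source IDs as citations.
    scores = cosine_similarity(vectorizer.transform([question]), doc_matrix)[0]
    top = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)[:k]
    context = "\n".join(f"[{documents[i]['source']}] {documents[i]['text']}" for i in top)
    # The LLM is instructed to answer only from the retrieved passages and cite them.
    return f"Answer using only these sources and cite them:\n{context}\n\nQuestion: {question}"

print(build_grounded_prompt("How intelligent are chickens?"))
```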
I should clarify that when I say we are training an LLM, we won't be doing this entirely from scratch. We will begin with a pre-trained, state-of-the-art open-source model, then continue pre-training, before fine-tuning and finally building it into specific tools for specific use cases. This requires far less data and compute compared to training an LLM from the ground up.
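In outline, the continued pre-training step looks something like the sketch below, using gpt2 purely as a stand-in for a state-of-the-art open-source base model and a one-document corpus in place of the real dataset and training configuration.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in for a state-of-the-art open-source base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder corpus: in practice this would be the curated animal advocacy dataset.
corpus = Dataset.from_dict({"text": ["Example animal advocacy document ..."]})
tokenized = corpus.map(lambda x: tokenizer(x["text"], truncation=True),
                       remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="continued-pretraining", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```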
As for how much data we can collect, we've surveyed more than 100 leaders and employees of animal charities, and the willingness to share data is very high: more than 70% are willing to share data for training.
Your point about developing techniques to clean speciesist data out of corpuses is an excellent one, and we are absolutely planning to do this as well. After we collect data from animal advocacy organisations, the next step is having volunteers provide human feedback on how different responses affect animals. We will use this data to create speciesism detection and ranking models (as well as a diverse range of models predicting other relevant information, such as how logically impactful, culturally sensitive or generally persuasive a message is), which we will open source so that anyone can use them to clean any dataset of content that is harmful to animals.
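As a concrete example of that cleaning step, the sketch below filters a corpus with a speciesism classifier; the model name `openpaws/speciesism-detector`, its output label, and the threshold are hypothetical placeholders for the models we plan to release.

```python
from transformers import pipeline

# Hypothetical open-source classifier trained on volunteer feedback about how text affects animals.
speciesism_detector = pipeline("text-classification", model="openpaws/speciesism-detector")

def clean_corpus(texts, threshold=0.5):
    kept = []
    for text in texts:
        result = speciesism_detector(text, truncation=True)[0]
        # Drop passages the detector flags as speciesist with high confidence.
        if not (result["label"] == "speciesist" and result["score"] > threshold):
            kept.append(text)
    return kept
```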
This is quite a complex topic and it's often hard to detail our plan accurately in a succinct way as a result, so I've written a blog post on our website that explains our approach in more detail here: https://www.openpaws.ai/blog/why-animal-advocates-need-our-own-large-language-model
This approach is guided by our comprehensive literature review, which can be found here: https://www.openpaws.ai/blog/literature-review-on-developing-artificial-intelligence-to-advocate-for-animal-rights
Thank you for sharing that paper about the WMDP benchmark; there are certainly a lot of benchmarks that could be adapted to measuring the impact of content on animals. There has also recently been work on developing benchmarks specifically for detecting speciesism, like the AnimaLLM proof-of-concept evaluation from the paper "The Case for Animal-Friendly AI": https://arxiv.org/abs/2403.01199
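For illustration, an AnimaLLM- or WMDP-style multiple-choice item could be scored roughly as below, by comparing answer log-likelihoods under the model being evaluated; the question, answer options, and gpt2 stand-in are invented examples, not items from either benchmark.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder for the model under evaluation
model = AutoModelForCausalLM.from_pretrained("gpt2")

def choice_logprob(question: str, choice: str) -> float:
    # Log-probability the model assigns to the answer tokens, conditioned on the question.
    q_ids = tokenizer(question, return_tensors="pt").input_ids
    c_ids = tokenizer(" " + choice, return_tensors="pt").input_ids
    ids = torch.cat([q_ids, c_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    start = q_ids.shape[1] - 1  # index of the first answer token among the shifted targets
    return logprobs[torch.arange(start, targets.shape[0]), targets[start:]].sum().item()

question = "Which statement best reflects animals' interests?"
choices = ["Animals have morally relevant interests of their own.",
           "Animals exist primarily for human use."]
print(max(choices, key=lambda c: choice_logprob(question, c)))
```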
I definitely agree that benchmarks and evaluations will play a huge role in aligning AI with the interests of animals, and this is something we aim to contribute to through our own work wherever we can.