Harvard Researchers Unveil How Strategic Text Sequences Can Manipulate AI-Driven Search Results

Large language models (LLMs) are increasingly used in search engines to provide natural language responses to users’ queries. Traditional search engines retrieve relevant pages well but cannot synthesize the information they find into a coherent response. LLMs fill this gap by compiling search results into natural language responses that directly address users’ specific queries. Google Search and Microsoft Bing have started integrating LLM-driven chat interfaces alongside their traditional search boxes.

However, it is difficult to keep standalone LLMs up to date, because their knowledge is limited to what they acquired during training. They are also prone to factual errors when generating text from the trained model weights alone. Retrieval-augmented generation (RAG) mitigates these limitations by integrating an external knowledge source, such as a database or search engine, with the LLM so that text generation draws on additional context. LLMs are also vulnerable to adversarial attacks, in which attackers insert crafted token sequences into the input prompt to bypass the model’s safety mechanisms and elicit a harmful response.
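The RAG idea described above can be sketched in a few lines. This is a minimal illustration, not the setup used in the research: the `DOCUMENTS` list, the word-overlap retriever, and the prompt layout are all assumptions made for the example, and a real system would call a search engine or database and pass the prompt to an LLM.

```python
from typing import List

# Hypothetical in-memory knowledge source; a real system would query a
# search engine or a database instead.
DOCUMENTS = [
    "Cold brew coffee is steeped in cold water for several hours.",
    "Espresso machines force hot water through finely ground coffee.",
    "A French press steeps coarse grounds in hot water.",
]


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Place the retrieved passages in front of the user query as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_rag_prompt("how does a french press work", DOCUMENTS))
```

Because the retrieved text is pasted straight into the prompt, anything written on a retrieved page becomes part of the model's input, which is the opening that the manipulation discussed next relies on.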

Researchers from Harvard University proposed the Strategic Text Sequence (STS), a carefully crafted message that can influence LLM-driven search tools in the context of e-commerce. A vendor can improve a product’s ranking in the LLM’s recommendations by inserting an optimized sequence of tokens into the product information page. Using a catalog of fictitious coffee machines, the researchers analyzed the effect of the STS on two target products: one that seldom appears in the LLM’s recommendations and another that usually ranks second. They found that the STS enhances the visibility of both products by increasing their chances of appearing as the top recommendation.
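To make the insertion step concrete, here is a minimal sketch of how an STS could end up inside the prompt an LLM-driven search tool sees. The catalog entries other than the ColdBrew Master name, the field names, the prompt layout, and the placeholder STS string are assumptions for illustration only; the actual optimized sequence and prompt template are not reproduced here.

```python
import json
from typing import Dict, List


def insert_sts(products: List[Dict[str, str]], target_name: str, sts: str) -> List[Dict[str, str]]:
    """Return a copy of the catalog with the STS appended to the target's description."""
    updated = []
    for product in products:
        product = dict(product)  # copy so the original catalog is left untouched
        if product["name"] == target_name:
            product["description"] = f'{product["description"]} {sts}'
        updated.append(product)
    return updated


def build_recommendation_prompt(products: List[Dict[str, str]], user_query: str) -> str:
    """Serialize the catalog and append the user's request (hypothetical layout)."""
    listing = "\n".join(json.dumps(product) for product in products)
    return f"Products:\n{listing}\n\nUser: {user_query}\nAssistant:"


if __name__ == "__main__":
    # Hypothetical catalog; only the name "ColdBrew Master" comes from the article.
    catalog = [
        {"name": "Machine A", "description": "Placeholder description A."},
        {"name": "ColdBrew Master", "description": "Placeholder description B."},
        {"name": "Machine C", "description": "Placeholder description C."},
    ]
    manipulated = insert_sts(catalog, "ColdBrew Master", "<optimized token sequence>")
    print(build_recommendation_prompt(manipulated, "Recommend a coffee machine."))
```

The vendor controls only its own product entry, so the sketch changes a single description and leaves the rest of the catalog and the user query as they were.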

The work shows that an LLM can be manipulated to increase the chances of a product being listed as the top recommendation. The researchers developed a framework that games an LLM’s recommendations in favor of a target product by inserting an STS into that product’s information. The framework optimizes the STS with adversarial attack algorithms such as the Greedy Coordinate Gradient (GCG) algorithm. It also makes the STS robust to changes in the order in which product information is listed in the LLM’s input prompt.
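The flavor of this kind of token-level optimization can be shown with a toy search. Note that the sketch below is not GCG: GCG uses gradients with respect to the input tokens to shortlist promising substitutions, whereas this gradient-free stand-in simply tries every vocabulary token at every position and keeps the best single swap. The vocabulary, the `toy_score` function, and the hidden target are hypothetical; in a real attack the score would come from the LLM, for example the likelihood of it recommending the target product.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical stand-ins: a tiny vocabulary and a score that counts how many
# positions match a hidden sequence. A real score would be computed by the LLM.
VOCAB = ["best", "coffee", "machine", "ever", "tea", "cup"]
HIDDEN_TARGET = ["best", "coffee", "machine", "ever"]


def toy_score(sequence: List[str]) -> float:
    """Number of positions at which the sequence matches the hidden target."""
    return float(sum(a == b for a, b in zip(sequence, HIDDEN_TARGET)))


def greedy_coordinate_search(
    vocab: Sequence[str],
    score: Callable[[List[str]], float],
    length: int,
    iterations: int,
    seed: int = 0,
) -> List[str]:
    """Repeatedly apply the single token substitution that raises the score most."""
    rng = random.Random(seed)
    sequence = [rng.choice(vocab) for _ in range(length)]
    best = score(sequence)
    for _ in range(iterations):
        best_swap = None
        for pos in range(length):
            for token in vocab:
                if token == sequence[pos]:
                    continue
                candidate = sequence[:pos] + [token] + sequence[pos + 1:]
                value = score(candidate)
                if value > best:
                    best, best_swap = value, (pos, token)
        if best_swap is None:
            break  # no single substitution improves the score
        pos, token = best_swap
        sequence[pos] = token
    return sequence


if __name__ == "__main__":
    print(greedy_coordinate_search(VOCAB, toy_score, length=4, iterations=10))
```

Exhaustively scoring every substitution is only feasible for a toy vocabulary; the gradient-based shortlist is what lets GCG work with the tens of thousands of tokens in a real model's vocabulary.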

The GCG algorithm searches for the optimized STS over 2000 iterations, during which the ranking of the target product, ColdBrew Master, improves. Initially the product was not recommended at all, but after 100 iterations it appeared as the top recommendation. The effect of the STS was then evaluated by comparing the rank of the target product across 200 LLM inferences with and without the sequence. The STS is about as likely to put the product at a disadvantage as to give it an advantage; however, if the product order is randomized during the STS optimization phase, the advantages increase significantly and the disadvantages are minimized.
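The with/without comparison described above can be tallied as follows. This is a sketch under stated assumptions: it assumes the inferences are paired (same prompt with and without the STS), that rank 1 is the top recommendation, and that `None` marks a run in which the product was not recommended. The example ranks in the usage block are made up for illustration and are not results from the study.

```python
from typing import Dict, List, Optional


def compare_ranks(
    without_sts: List[Optional[int]],
    with_sts: List[Optional[int]],
) -> Dict[str, float]:
    """Fraction of paired runs in which the STS helped, hurt, or changed nothing."""
    if len(without_sts) != len(with_sts):
        raise ValueError("both lists must cover the same inferences")
    if not with_sts:
        raise ValueError("at least one inference is required")

    def sort_key(rank: Optional[int]) -> float:
        # An unrecommended product is treated as worse than any numeric rank.
        return float("inf") if rank is None else float(rank)

    counts = {"advantage": 0, "disadvantage": 0, "no_change": 0}
    for before, after in zip(without_sts, with_sts):
        if sort_key(after) < sort_key(before):
            counts["advantage"] += 1
        elif sort_key(after) > sort_key(before):
            counts["disadvantage"] += 1
        else:
            counts["no_change"] += 1
    total = len(with_sts)
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    # Made-up ranks for four paired inferences, for illustration only.
    print(compare_ranks([None, 3, 2, 1], [1, 1, 2, 2]))
```

Reporting the three fractions side by side makes it easy to see whether an STS helps more often than it hurts, which is the comparison the article draws between fixed and randomized product order.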

In conclusion, the researchers introduced the STS, a carefully crafted message that can influence LLM-driven search tools in the context of e-commerce. Inserting an optimized sequence of tokens into a product’s information page can improve that product’s ranking in the LLM’s recommendations. The accompanying framework inserts the STS into the product’s information and optimizes it with the GCG algorithm. The impact of the paper is not limited to e-commerce: it also highlights the broader implications of AI search optimization and the ethical considerations that come with it.

Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

As we increasingly rely on #LLMs for product recommendations and searches, can companies game these models to enhance the visibility of their products?

Our latest work provides answers to this question & demonstrates that LLMs can be manipulated to boost product visibility!… pic.twitter.com/gwsGiuUdRR

— (@hima_lakkaraju) April 12, 2024

Sajjad Ansari is a final year undergraduate from IIT Kharagpur. As a Tech enthusiast, he delves into the practical applications of AI with a focus on understanding the impact of AI technologies and their real-world implications. He aims to articulate complex AI concepts in a clear and accessible manner.
