Raise your editorial standards
Alex says: “Be concise and raise the game of your editorial standards.”
Why is being concise more important nowadays?
“Machines, LLMs, and agents are all ingesting content using tokenization. The more tokenization that's used, the more processing power is required. It’s almost like the content version of site performance and site speed.
If there's a slow-loading site, it's going to take a while to load for people, which might impact UX and other metrics like CTR. But when it comes to content, content for content’s sake was a bit of an old, grey hat technique of just scaling content and doing any old thing to get as much exposure, and as many keywords, as possible. Now, being too detailed doesn't always help. Don't blabber on too much, is the informal way of saying it.
Often (especially with tech or startup sites), you go to the About page, read four sentences, and you still don't know what they do. They use very colourful, fluffy, and salesy words to impress the human. Right now, though, the audience is not just a human; machines are the audience as well. They're ingesting that content, and they don't get sold by that kind of language. They get sold by direct language: things that get to the point, answer the questions, and solve a problem. That is the direction content should be going.
We've let ourselves down over the last few years by lowering the standards of editorial in general, so that anyone can make content for content's sake. Now, this will take away content mediocrity and raise those standards – not just in general everyday content, but also for journalistic editorial content online. They will have to adapt to higher standards in order to have the best exposure.”
Was the old keyword-stuffed, grey hat content just a case of SEOs following the algorithm?
“Everyone was not only chasing the algorithm but also trying to find and exploit loopholes in order to rank, gain visibility, and get some extra sales or conversions. Whilst that's good in the short term, producing low-effort content over and over again is just a black hat technique. It's low-end automation at scale. This year, we've seen enough of people trying to scale content with AI.
On Twitter, you’ll see someone saying, ‘I scaled up content, look at this graph that's going up, isn't this great?’ I will favourite them so that I can check in six months later and see how it's going, and not one of them is still gaining momentum or growing. They've all been penalised or they’ve scaled back because it just hasn't worked in the long term.”
If you work for a business that’s been publishing a lot of content for the last 10 or 15 years or so, is the first step to review and prune that?
“Yes. It sounds similar to when the HCU (Helpful Content Update) first started out, doesn’t it? There was a lot of content cleaning and bringing pages together.
If you've got five pages that are covering very similar things, because you were doing content for content's sake back in the day, maybe now it's better to consolidate those things into one long-form piece of content. Make it more editorially concise by taking out repetition and filibustering that was only there to pad out words and fill some keyword structure. It's not just about how many keywords, how many words, clustering, etc.
Internal linking has always been important, but it's even more important now, for the flow of LLMs. Instead of a giant pyramid that keeps going down and down to 9 levels, you want to flatten out your IA a little bit more, so it's easier for a machine to navigate, as well as a human. That works with content as well. Make it a bit flatter. Don’t go into a topic, then a subtopic, then a subtopic of that, and a subtopic of that. Not many people can do that in a legitimate and concise way.
Bringing that together and removing any content that is a waste of time is going to be a good thing for the internet as a whole.
Some of it is basically web spam. I’m sure everyone remembers the Coldplay concert where those two people were caught cheating. For the next four weeks, the internet was just full of it. As soon as you've got three or four publications talking about it, you don't need any more. Then brands started doing the annoying thing of putting their own comedic take on it.
To me, that is a waste of time – and it’s a waste of time to an agent and an LLM as well. They just want to understand the source, what happened, and synthesise an answer. However, we go over the top to try and get those visits and some conversion going on. It’s not going to work as easily in the future, or in the longer term.”
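As a rough illustration of the flatter information architecture Alex describes, the sketch below measures how many clicks each page sits from the homepage using a breadth-first search over internal links. The site map, the function name click_depths, and the threshold of three clicks are all hypothetical and chosen only for this example; a real audit would start from crawl data.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal-link map: page -> pages it links to.
site = {
    "/": ["/services", "/blog"],
    "/services": ["/services/seo"],
    "/services/seo": ["/services/seo/audits"],
    "/services/seo/audits": ["/services/seo/audits/technical"],
    "/blog": [],
}

depths = click_depths(site, "/")

# Assumed threshold: flag anything more than three clicks from the homepage.
deep_pages = [page for page, depth in depths.items() if depth > 3]
print(deep_pages)
```

Pages flagged this way are candidates for an extra internal link from a higher-level page, or for consolidation into a parent page.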
You say that you should structure your content as if an LLM is your primary reader, so how do you do that?
“Firstly, make sure all of those H1s, H2s, etc., are in there, but it goes beyond that.
Anything that can add formatting to text may be relevant. I know it sounds weird, but things like italics, bold, and table structure are quite important now because LLMs ingest that. They’re not just reading words; they’ll understand those things.
They also read a lot of markdown. If you’re in a more technical vertical, then it may be more useful to make markdown versions of that, because that's what LLMs love. The one thing that's unanimous in the world of AI is that LLMs like to ingest markdown more than any other format – including markup.
Markup is essentially markdown that's been made to be more appealing to the human eye, but the human eye isn't the only thing looking at our content anymore. There's a robot looking at it. When it comes to tokenization, markdown and good structure help with that.
OpenAI have their own tokenizer that you can try for free, where you copy and paste words, and it shows you how it tokenizes things. It’s a helpful calculator, and you can use that as a good test. Reducing the amount of processing power an LLM would use to ingest all the content on a given page or site is going to be advantageous.
LLMs have a cheap mentality. They'll try and find the cheapest way to complete a process because that's the fastest, most efficient, and most cost-effective way of doing so.”
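To make the markdown point a little more concrete, here is a minimal sketch that compares the same snippet written as HTML and as markdown. It uses a crude stand-in for tokenization (one token per word or punctuation mark), which is an assumption made for this example and is not how OpenAI's tokenizer works, so the numbers are only meaningful relative to each other. The sample content is invented.

```python
import re

# Crude proxy: one "token" per word or per punctuation mark.
# Real LLM tokenizers split text differently, so treat these
# counts as a relative comparison, not an absolute measure.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def rough_token_count(text: str) -> int:
    """Count word and punctuation chunks as a rough token estimate."""
    return len(TOKEN_PATTERN.findall(text))


# The same invented content, once as HTML and once as markdown.
html_version = (
    '<div class="content"><h2>Pricing</h2>'
    "<p>Plans start at <strong>10 dollars</strong> a month.</p></div>"
)
markdown_version = "## Pricing\n\nPlans start at **10 dollars** a month."

html_tokens = rough_token_count(html_version)
markdown_tokens = rough_token_count(markdown_version)

print(f"HTML: {html_tokens} rough tokens")
print(f"Markdown: {markdown_tokens} rough tokens")
```

For real figures, paste both versions into the tokenizer Alex mentions and compare the counts it reports.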
You also advocate exploring and experimenting with new open standards, like llms.txt. What is that, and why should SEOs be using it?
“Whilst it’s not standardised like robots.txt is today, llms.txt is a proposed standard written by a man named Jeremy Howard.
In a nutshell, think of it as a summary of your site with good on-site references, covering what you do, what you're selling, what you're offering, etc. That's all it is, but it covers everything that it needs to in order for an LLM to then dig deeper.
Think of it as a first look at a house you’re thinking of buying. You can go in and get a gauge of it, but you're not answering questions about what's inside the walls or what the roof is made out of quite yet. That may be on your second visit, when you delve deeper and investigate further into what you might want to purchase.
Some people may say it's the new meta keywords, or it can be abused. I would counter-argue that by saying that everything can be abused in the world of the web. If you produce your own llms.txt file, of course, it will be subjective, but as long as you're honest and open in it and you're not trying to abuse the LLM to make it do something it shouldn't, then it's a legitimate thing to do.
There’s been a lot of discussion over whether or not it should be included, but the longer you discuss it, the less time you’re spending producing it. Unless you’ve got a huge site, it can take about 20 minutes to produce an llms.txt file.
In the free version of Yoast SEO, you can generate it automatically, so you don't have to do any editing whatsoever. If you put out 20 posts a day, it will dynamically update that.
Whether or not a lot of LLMs out there are using it is immaterial. I consider it insurance. If one LLM decides to start using it, everyone who has been criticising llms.txt will very quickly shut up. Meanwhile, everyone who's enabled it through Yoast SEO will already have it there.”
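For anyone producing the file by hand, the sketch below assembles a small llms.txt in the layout the proposal describes: a title heading, a short blockquote summary, then sections of annotated links. The site name, URLs, descriptions, and the helper build_llms_txt are all hypothetical; check the current proposal before relying on the exact layout.

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Build llms.txt text from a summary and sections of (title, url, note)."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in pages:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site details, for illustration only.
llms_txt = build_llms_txt(
    site_name="Example Shop",
    summary="Example Shop sells running shoes and publishes training guides.",
    sections={
        "Products": [
            ("Running shoes", "https://example.com/shoes", "Full product range"),
        ],
        "Guides": [
            ("Marathon plan", "https://example.com/guides/marathon", "16-week plan"),
        ],
    },
)
print(llms_txt)
```

The output would be saved as llms.txt at the root of the site, next to robots.txt.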
Why should SEOs work with MCPs (Model Context Protocol servers)?
“For an SEO, the best way of using MCPs is to connect data and be able to interact with it on a more conversational level – and be able to understand big data without being a super advanced data analyst.
You can ask it questions, it can analyse that data, and then it can bring it back to you in a natural language manner that you can then understand more and ask more questions about.
For example, you can connect Search Console’s MCP to Claude or Perplexity, and I think OpenAI do it now. It's like putting a USB stick into the hard drive of the LLM, and now it has all of that extra data.
Previously, I compared it to when they were trying to escape on the roof in the first Matrix film, and Trinity needed knowledge on how to operate a helicopter in order to escape, so the Nebuchadnezzar outside the Matrix uploaded all of that knowledge directly into her brain, and all of a sudden, she knew exactly how to operate that helicopter. It’s kind of like that.
You’re uploading all of that data and knowledge into a larger LLM so it can contextually understand that data and interact with you on it.
You can do more than just Search Console as well. Google Analytics now have one. I don't think Semrush do, but Ahrefs do. SISTRIX has one. Then, you can connect all of those things and ask it questions about multiple things. That’s a really good thing because it can also answer questions that are meant for people who are less data-focussed, in the marketing team or the content team. They may find it useful.
‘What was working this time last year during Black Friday?’, ‘What made the most conversions?’, ‘What were the biggest pages that attracted visitors this year that maybe I could use?’, ‘What didn't work?’ It can read the data and analyse it.
You can upload your own data as well. If you had a Shopify shop, you could export all of your orders, upload that as a document, and then ask it questions about the orders, whilst correlating it with anything that the MCP is connected to.”
Is the need to be concise potentially a short-lived requirement, and longer content may come into fashion again because LLMs can make sense of so much data?
“It might, but long content can still be concise. If a highly detailed research paper is 20,000 words long, it’s not that 8,000 of them are fluff. It's all concise and structured and done intentionally.
White papers are a good example of keeping to an agenda, not diverting too much, and not waffling too much. It stays on track and stays in its lane. Again, that's ingested in a better way.
In the long term, things are unpredictable. I've had conversations with some people who are convincing me that the browser may die in the next few years. The browser is made to bridge a gap between the content and styling and the human who is looking at it. Now that a machine might become the primary audience looking at your site in the next couple of years, the idea of websites may change.
Think of a Kindle, and the Kindle file format. You can put 10,000 books onto this one device, and it outputs in a very flat way. It's got all the formatting to go with it, but they're just text files. That's essentially markdown. Apps and LLMs may turn into a kind of Kindle management service for the open web, where they pull information off on a more basic level and then merge that with the media you may have inside.
I don't think humans are going to be looking at websites in the same way that we have been. That's a shift that we're all going to have to adapt to in the next couple of years. This is theoretical (the browser may live for another 25 years), but I believe that the way we're ingesting content, searching for things, and discovering things is going to change in the longer term.”
Is there no ideal length for a piece of content?
“Consistency is best, in terms of what expectations the audience has, but there's no definitive answer. It depends on the message that you want to convey.
For example, long-form podcast interviews are a trend right now, and some of them are very long. There was a recent Lex Fridman interview that was 10 hours long. In my head, I'm thinking, how do you even sit and talk for 10 hours? That is a feat in itself. Then you've got to think, who's going to listen for those full 10 hours? Probably not everyone, unless you're very focussed on that interview.
Then you'll get summaries, or you'll get it chopped and changed. Now that there are chapters on YouTube, you may only be interested in certain segments of the interview. Also, LLMs will be able to take the transcript and the chapters, and soon they'll probably be able to understand things like mood and sarcasm.
Whilst a long-form podcast is kind of fluffy, because you can talk about all sorts, you also get unadulterated content. If we were talking for three hours, we'd go into much more granular detail. Again, it's about being concise and getting to the point, and LLMs will only get better at understanding the summary and which things are important – particularly once they start making more connections and understanding our personalities.
They will be able to determine which bits of that 10-hour interview with Lex Fridman are actually applicable to me. What am I going to find interesting based on what it knows, and what it’s seen in my browsing history and Gmail activity? That's going to be really cool, but it also makes it much harder for us as SEOs, because personalisation is the norm now.
When personalisation started, if you wanted to check the ranking of something, you would go to whatever the search term was on Google and just add the URL parameter ‘pws=0’, and that meant that personalisation was disabled, and you would see completely different rankings. People aren't talking about that because no one's got depersonalisation anymore.
Even on OpenAI, you can have zero account activity, you can be logged out, and you can ask the same thing as me, and we'll get a different result. That's how personalised it is. There's nothing flat anymore, and it makes it much harder to attribute for rankings. It makes it much harder for rank tracking to be accurate and valuable. Everything's up in the air for the next couple of years.”
Alex, what's the key takeaway from the tip you shared today?
“Stop making fluffy content. Stop scaling content when you don't need to. Make things that are concise. Make sure the formatting and structure of the piece are actually there.
Also, read more about journalistic editorial guidelines. Look at the guidelines that top publishers like the FT use when they're writing content, and get as high up in standards as possible.
I could write 500 words on the death of the browser, but don't force me to write 1,200 words for the sake of making it a bit longer. If I only have 500 words to say, then that's it. It's concise. It's well-structured. It has its points, and it answers some questions that can then be synthesised into an answer.”
Alex Moss is a Principal SEO at Yoast and Co-Founder of FireCask. Find out more over at Alex-Moss.co.uk.