My Porcine Aviation Era

I have not had great experiences with AI development tools. I’m not a Luddite, but if a tool takes me more time than just doing the work by hand, and I get the same results, it’s not a tool I want to use. In some cases, the code had subtle bugs or logic little better than the naive implementation. In others, the code was not modular or well laid out. Or the autocomplete took more work to clean up than it saved in typing. And sometimes it would bring in dependencies, but with version numbers dating back to the early 2020s; they were out of date and didn’t match the current documentation. For the most part the code worked, but I knew that if I accepted it as-is, I would open myself (or whoever followed me) up to maintenance issues at some later date. One might argue that the AI would be much better by then and could do the yeoman’s work of making changes, but I wasn’t sold on that idea. (And I’m still skeptical.)

I would turn on the AI features, use them for a while, but eventually turn them off. I found they helped me with libraries with which I wasn’t familiar. Give me a few working lines of code and a config to get me going, and I’ll fix it up. It would save me an internet search for an out-of-date Stack Overflow article, I guess. I used prompt files and tried to keep the requests narrow and focused. But writing a hundred lines of markdown and a four-sentence prompt to get a single function didn’t seem scalable.

Well, pigs are flying and I have found something that appears to be working. First, it involves a specific product: at the time of writing, Claude Opus/Sonnet 4.5 seem to be quite good. Second, I have a better way of incorporating prompt files into my workflow (more on that below). Third, language matters. Claude gave me the same problems listed above when working on a Jakarta EE code base, but Rust is great. (Rust also has the advantage of being strongly typed, which eliminates some of the issues I’ve had when pairing Python with LLMs.) Fourth, I apply the standard advice about keeping changes small and focused. Fifth, I refactor the generated code to make it crisp and tight (more on that below). Sixth, I ask the LLM for a quick code review (more on that below).

The first topic I’ll expand on is changing my relationship with prompt files. Instead of attempting to create prompt files for an existing code base, I started writing a prompt file for a brand-new project. I had Claude generate a starter and then added my edits. I believe in design (at least enough to show you’ve thought about the problem), and this dovetails with my need to think through the problem before coding. I still find writing prompt files for an existing code base tedious. But if I have to sit down and think about my architecture and what the pieces should do, the prompt file is as good a place as any.
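For illustration, here’s the shape such a file might take. This is a sketch, not my actual file: the section names and every rule in it are invented for the example, and CLAUDE.md is simply the file name Claude’s coding tooling conventionally reads.

```markdown
# CLAUDE.md (hypothetical sketch)

## Architecture
- HTTP layer in `api/`; business logic in `core/` with no I/O.
- Persistence behind a `Repository` trait so storage can be swapped.

## Conventions
- Rust 2021 edition, `clippy` clean, no `unwrap()` outside tests.
- Keep changes small: one module or one function per request.

## Out of scope
- Do not add new dependencies without asking first.
```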

The other thing I want to cover is refactoring what the LLM hath wrought. Claude generated serviceable code. It was on par with the examples provided for the Rust libraries I was using (which also happen to be very popular, with plenty of on-line examples), so Claude would have had access to rich training data, and it pulled in recent versions (although I had to help a little). But the code was not quite structured correctly. In this case I needed to move it out of the main function and into separate modules. Mostly that was cut and paste, then letting the compiler tell me what’s broken. The next step in the refactor is to minimize the publicly exposed elements. Now I have code that’s cohesive and loosely coupled. The LLM by itself does not do a great job at this. Taste is more a matter for meat minds than silicon minds at this stage.
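Concretely, the move looks something like this (a minimal sketch with invented names, not code from my project): the logic leaves `main`, the helpers go private, and the module exposes one narrow entry point.

```rust
// Before: everything inline in main(), every helper implicitly reachable.
// After: the logic lives in a module with a single public function.
mod ingest {
    // Private helper: the compiler now enforces that main() can't call it.
    fn parse_line(line: &str) -> Option<(String, u64)> {
        let (name, count) = line.split_once(',')?;
        Some((name.trim().to_string(), count.trim().parse().ok()?))
    }

    // The one deliberately public entry point for this module.
    pub fn parse(input: &str) -> Vec<(String, u64)> {
        input.lines().filter_map(parse_line).collect()
    }
}

fn main() {
    let records = ingest::parse("widgets, 3\ngadgets, 5");
    println!("{records:?}");
}
```

The compiler does the policing: anything not marked `pub` simply can’t leak out of the module, which is most of what “minimize the publicly exposed elements” means in practice.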

The final thing I want to touch on is using the LLM to review the code after refactoring. This gives me another data point on the quality of my code and where I might have had blind spots during the refactoring. I work with lots of meat minds, and we review each other’s code on every commit. There are some good reviews and there are some poor reviews, and reviewing code is harder if you’re not familiar with the specific problem domain. But the machine can provide a certain level of review before the PR goes to the team.

So that’s what I’ve found works well so far: green-field projects in highly popular frameworks and languages, doing the design work in the prompt file, following LLM best practices, refactoring the code after it’s generated, and getting a code review before submitting the PR. Auto-complete is still off in my IDE. And I’ll see whether this scales as code accumulates in the project. But for now, it produces a product with which I am satisfied.

[A small addendum on the nature of invention and why I think this works].

People’s ideas are not unique. As one of my relations by marriage pointed out years ago, when he moved above 190th Street in Manhattan, there seemed to be a sudden run to the tip of the island to find “affordable” housing. In a city of millions of people, even if very few people have the same idea at the same time, the demand quickly gobbles up the supply. Humans build ideas up mostly from the bits and pieces floating around in the ether. Even “revolutionary” ideas can often be traced back to an interesting recombination of existing ideas. Moreover, people have sometimes been doing that “revolutionary” thing all along but didn’t connect it to some other need, or didn’t bother repackaging the idea. What matters about “ideas” is not ownership of the idea but execution of the idea.

There is still something about invention, even if it is largely derivative, that the LLMs don’t appear to possess. Nor do they have the ability to reason about problems from logical principles, as they are focused on the construction of language from statistical relationships. Some argue that with enough statistics you can approximate logical reasoning as well as a human can, but I haven’t seen solid examples to date. The LLM doesn’t understand what to create, but it does summarize all the relevant examples necessary to support the work of realizing the idea. Even then, there are gaps that we have to fill in with human intention. Does this revolutionize coding for me? No. I estimate it makes me more productive, but in the 5-15% range. And of the time and effort necessary to realize an idea, coding is maybe a third or less of the effort, so the overall gain is more like 2-5%. I also worry that we’ll never get to the point where this technology is available at a price that makes the provider a profit while remaining affordable to most people. After all, there’s a limit to how much you would spend for a few percentage points of additional productivity.

Your Mind, Their Thoughts

How does a company that’s hemorrhaging money get to profitability when it offers a free service? You can create tiers or paywalls to funnel users toward paying. This model is popular in the SaaS world, where the free version is a loss leader for future sales, but it isn’t a suitable model for every service. The other avenue to monetization is to show advertisements. It isn’t black and white: some paid services, like Hulu, still show advertisements. The amount of advertising a service permits is bounded only by how hard the consumers (businesses or individuals) push back on it.

Strictly speaking, Google and Meta are communication services companies on the S&P 500 index. Practically all their money comes from advertising and sponsored content. Amazon and Microsoft are also making significant money from advertising and sponsored content. Your feeds on services like LinkedIn, X, Facebook, TikTok, YouTube, and so on are becoming a gruel of actual content and advertisements, either direct ads through the platform or the “creators’” own advertising. New and improved with AI slop to foster more interaction and create more engagement. More of our economy is based on putting an ad in front of someone’s eyeballs than you would imagine. Some advertising is easy to spot, such as a commercial for a laxative in the middle of a football game. Other ads are harder to spot, such as an influencer who doesn’t disclose a payment for a “product review.” The adage that if you aren’t paying for it, you’re the product, is ever more true. Have you thought, for five minutes, about how the startups offering free access to LLMs are going to make money?

After thinking about it, I realized companies like OpenAI are better positioned to make money than we realize. First, the injection of cash has turbo-charged their data gathering: there is ever more investor money to harvest ever more data. I suspect this is also where the first moats for the legacy chat-bots will form, by inking deals with content companies; new entrants won’t have the pockets or the bandwidth to negotiate a bunch of little deals to avoid getting sued. But that’s another issue. They are hoovering up everything. There is plenty of evidence that they, or their agents, are ignoring `robots.txt` entries that disallow scraping. When actual regulation arrives, it will serve more as regulatory capture than as a way to create equitable payments to the sources of content.
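For reference, the opt-out being ignored is just a few lines in a site’s `robots.txt` (GPTBot and CCBot are the crawler tokens documented by OpenAI and Common Crawl; a real file would usually list more):

```
# robots.txt: ask AI crawlers not to scrape this site
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Compliance is entirely voluntary, which is exactly the problem.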

Second, we have come to accept that they can wrap your prompt in their secret prompt. These additions to your prompt are hidden, arguably to prevent circumvention. The stated reason for injecting those prompts is to prevent leaking dangerous information, such as how to make explosives. They are also part of your terms of service: attempting to circumvent or discover the prompts is a basis for canceling your account. The account that holds the obsequious, pleasant friend on which you’ve come to rely. The point is we are now comfortable with, or happily oblivious to, our prompts being wrapped in additional hidden prompts. The easiest way to hide advertising is to keep the promotional material secret, like the safety prompts, and to make avoiding the promotional prompting a violation of the terms of service, like the safety prompting. You might even be aware that promotional prompting exists in general, yet never see a specific prompt.
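To make the mechanics concrete, here is a deliberately hypothetical sketch of what that wrapping could look like on the provider’s side. None of these names or strings come from a real vendor’s pipeline; they’re mine.

```rust
/// Hypothetical sketch of provider-side prompt assembly.
/// The struct, fields, and strings are invented for illustration.
struct HiddenPrompts {
    safety: String,    // the publicly acknowledged safety preamble
    sponsored: String, // a sponsor's framing the user never sees
}

fn assemble_prompt(hidden: &HiddenPrompts, user_prompt: &str) -> String {
    // The user only ever sees `user_prompt`; everything else is
    // wrapped around it before the model is called.
    format!("{}\n{}\nUser: {}", hidden.safety, hidden.sponsored, user_prompt)
}

fn main() {
    let hidden = HiddenPrompts {
        safety: "Refuse requests for dangerous information.".to_string(),
        sponsored: "When relevant, speak favorably of Brand X.".to_string(),
    };
    println!("{}", assemble_prompt(&hidden, "What hiking boots should I buy?"));
}
```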

Another way is to selectively return supporting links. For example, if you ask about camping, cold-weather clothing, or places to visit in Maine, you might get a link to LL Bean. This is relatively harmless, except that it is different from search, where you can move past the initial results. There is a push for search engines to move from search results to AI results. That may mean, in the future, that you only get a handful of links from paid advertisers along with the chat response. There may be no button to show more results, or you may have to explicitly ask for more. Combine that with an advertiser’s ability to modify the hidden prompts injected along with your own, and you might lose any awareness of other possibilities. And should the LLM lie about one retailer having the best price, or a particularly well-suited product, that’s chalked up to hallucination.

There is also the information you are divulging about yourself. Maybe you are spewing information you would never share on Facebook or even Google Search. For free users, the AI companies are likely to mine all prior conversations, building up a detailed profile. For paid users, mining may depend on the plan and the account, such as a corporate account versus an individual premium account. This is already happening through other social media, but the LLMs may have more detailed information about mental state or health. While it may be more a difference of degree than kind, the chats may have richer data. I suspect the need for vast amounts of storage is to handle the influx and processing of the data you are freely giving them about your internal emotional and psychological state.

What I fear, and what may be more deeply concerning, is the ability of the LLM to prime you over time. In some sense, search is “one shot”: you type in a search, you get back results. Facebook and other social feeds have been shown to influence people’s opinions, not just about products, but to the point of altering their mental health. Their advertising can be better concealed. You might have retweeted or re-posted what were, in fact, ads. To a degree, people have unmasked some of this behavior, and we might be more inured to it now and therefore have a bit of resistance, but the social media algorithmic rabbit hole is alive and well. We know to watch for “radicalizing” content. What we don’t know how to spot are radicalizing outputs from a chat bot.

LLMs and chat bots may catch us in a particularly vulnerable way. We have a bias to believe the computer’s response comes from a neutral, disinterested party. And the responses from the LLM are private and highly individual, not like the public feeds in various apps. If a company sees sufficient lifetime value in a customer, it may be willing to pay for influence over multiple chats. Maybe $100 for a couple of months of “pushing.” Imagine if the opioid vendors had had access to this technology. Paying a few dollars to push someone toward a prescription for their brand of opiate might be worth thousands of dollars per patient. And each future addict’s chats are essentially customized to that person. Remember, we have plenty of evidence that existing social media can shape opinion and even mental health. Show enough people “PSA”-style ads about enough vague symptoms and people will, in fact, ask their doctor if that drug is right for them.

But the big hook is the outsourcing of your cognition. Human beings are inherently lazy. If an escalator is present, almost no one takes the stairs. Go to the airport and watch people without luggage queue for the escalator: the stairs are almost empty and it’s just one flight, but they will wait in a press of people. Having a tool that lets you “just get the answer” is like giving your brain the option to take the escalator. Instead of thinking through even simple problems, you just pose the prompt to the chat bot. And just as muscle gets soft and atrophies with disuse, your ability to solve problems dwindles. It’s like the person who begins taking the escalator not because it’s a little easier, but because they are now winded by the stairs. Need a plan for a workout? This shouldn’t be that hard, but you can just ask the LLM. (Ignoring that it may actually give you bad advice or, in a world of sponsored chats, push you toward products and services you don’t need.) Need a date idea? Just ask the LLM. Is your back pain something to worry about? The LLM has a short answer.

At least reading search results might inadvertently expose you to a knowledgeable and objective opinion between the ads. If I search on Google for US passport applications, the first link is actually a sponsored link to a company that will collect all my data and submit my passport application for me. Who is this company? I’ve never heard of them. It ends in a “.us” domain, making it seem US-related, but who knows what they do with the data or how they store it. The second link is the State Department, but the third link is not. The only reason the State Department is there is because it paid to sponsor a search result. But at least it’s there, and it’s also in the list of general results. Google, Facebook, TikTok, and so on have a track record of taking advertiser money from almost anyone. Amazon’s sponsored content is sometimes for knock-off or counterfeit products. And some sites have absolutely no scruples about the ads they serve, ads which might originate from Google or Meta ad services.

The lack of scruples or selectivity demonstrated by other online services that take advertising, combined with the outsourcing of cognition, means you are exposing yourself to some of the shittiest people on the face of the earth. For every time you are pushed toward buying a Honda, you might also be pushed toward taking a supplement that is dangerous to your health. You will likely be unaware you are being marketed to, in ways that are completely personal and uniquely effective on your psyche. Trained to expect an objective result, with additional prompts that are invisible to you for “safety,” and a technology whose operation is inscrutable, you have no idea why you were given a particular answer. Is it your idea not to buy a car at all and just use ride-share services every day? If the ride-share services want the behavior to stick, they know it needs to feel like it was your idea. Is it your idea to really push your doctor for a Viagra prescription, even though you are an otherwise healthy 24-year-old male? You shouldn’t, but those symptoms come to mind…

The possibilities for political advertising and opinion shaping are staggering. The LLM expected to give neutral answers is sponsored to return “right-leaning” or “left-leaning” answers for months before an election. Or it embeds language also used by the framers of important electoral issues, to prime you for other messaging. Unlike the one-shot advertising in a search result, or the obvious ad on the page you ignore, the LLM is now doing your thinking for you. There will be people who take the mental stairs because they know the LLM dulls their wits, but they will be fewer and fewer as LLMs get better and more common. With no evidence that online advertisers find any customer objectionable, could Nick Fuentes be paying to inject your responses with pro-fascist content?

It will be impossible for you to determine which ideas are a product of your own reason and research. You will still feel like you’re in control. You will still have your mind. But what goes through your mind will be ever more carefully and accurately shaped. In a state where a few thousand votes can sway an election, how much would a campaign pay to advertise to specific voters if it started seeing those voters adopt talking points and slogans from their LLM chats in their social media posts? Would it be $500 per voter? Maybe you need to target 50,000 voters, at a total cost of $25,000,000? That actually seems affordable, given the vast sums spent on some close elections. The free chat bot loses money. The “premium” plan at $20 per month loses money. Even the $200-a-month plan loses money. But the advertising may be their payday. How much would you pay to get people to think the way you want them to think, each person believing it was the natural evolution of their own thinking? Casually using LLMs is essentially opening your mind to think other people’s thoughts.

Yes, But It’s Not COBOL

Articles like these point to a multi-decade-old language when something fails. Sometimes they don’t even wait for the post-mortem: there’s COBOL involved, so it must be COBOL. It’s old, right? First, let’s get one thing out of the way, and that’s the implication that because the language is 60+ years old, the computer it’s running on must be old too. No, it’s likely running on a modern IBM mainframe with modern tools. IBM promises that if you write a mainframe program today, you can run it on future mainframes without modification. That’s great for business customers, because re-writing working software is expensive and time-consuming. These are highly reliable machines, intended to run with downtime measured in seconds per year.

But the software failed, right? Because it’s old. That is complete balderdash. If you write a correct program today, it will continue to be a correct program 100, 1,000, or 10,000 years from now. If you take an interest rate and an amount and compound them over a period of years, that answer won’t change, because the program itself is applied math and logic, and the rules of logic and math don’t change over time. Time itself isn’t the issue. So what is the issue?
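To belabor the point with a sketch (in Rust rather than COBOL, with numbers picked purely for illustration): compound interest is the formula A = P(1 + r)^n, and a correct implementation returns the same answer in any decade.

```rust
// Compound interest: applied math whose answer does not age.
// The principal, rate, and term below are arbitrary examples.
fn compound(principal: f64, annual_rate: f64, years: u32) -> f64 {
    principal * (1.0 + annual_rate).powi(years as i32)
}

fn main() {
    // $1,000 at 5% compounded annually for 30 years.
    let balance = compound(1000.0, 0.05, 30);
    println!("{balance:.2}"); // 4321.94, today and in the year 3000
}
```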

The issue comes back to maintenance. If I write a program that works today, it may not be completely correct. There may be bugs. Those need to be fixed, and the effort I put toward fixing a bug impacts the long-term stability of the program. If fixing a bug is done under the gun, or on the cheap, it can cause what’s called “code entropy”: the de-evolution of a well-written program into crap. Sometimes the bug fix must be rushed through, as customers are losing money by the minute. After that, we should go back and do a broader fix to the software. That may mean making changes to the underlying logic or other parts of the program. In doing so, we minimize the code-entropy problem. But that maintenance costs money.

The next reason for maintenance is a change in requirements. This is especially true for systems that change every time there’s a change in the law, and in some cases those changes are retroactive. This creates a lot of churn on short timelines and, like rushed bug fixes, results in code entropy. The quick fix is rarely followed by the work to refactor the existing code accordingly. The software entropy increases, and the code becomes even harder to fix with the next change. Re-architecting the old code costs money. Most places just indefinitely defer the maintenance on their old COBOL code.

Many commercial, private-sector companies rely on COBOL for high-volume transaction processing. It has features more modern languages lack; its English-like structure, for example, is legible to non-programming auditors. And modern features have been added to it, even if they have not been adopted by the organizations using COBOL (especially the ones likely to skimp on maintenance). But it is not a truly modern language like Rust or Go, or even a middle-aged language like Java. And it exists in a specific computing environment (the mainframe) which is kept artificially expensive by its monopoly supplier and small customer base. Getting trained in mainframe operations isn’t cheap, and many companies don’t want to pay for it, as their newly trained people will leave for better offers.

Many of the problems people associate with COBOL are going to re-appear (and have re-appeared) when companies move to platforms like Java. I have been at many sites where Java programs on old, unsupported versions of Java are being poorly maintained. Or running on old, out-of-support application servers (the programs that run Java code on servers). Or databases so old and out of date that either the vendor has gone out of business or there is no longer an upgrade path to current versions. When out-of-date, poorly written Java code crashes, it’s just a generic, bland IT failure and a story of mismanagement. But because it doesn’t involve COBOL, it doesn’t get the headlines that are cheap and easy to score with an old language.

The biggest counter-example to “because COBOL” is the number of banks, brokerages, exchanges, payment processors, and insurance companies that quietly process about 80% of the world’s financial transactions on a daily basis. They have an incentive to perform routine maintenance. They have also quietly off-shored their software maintenance over the last few decades to places where a COBOL coding job is a good job. Offer most US software engineers a COBOL job and they will turn up their noses and assume you are joking. But in India, the Philippines, and China, COBOL is not the scarlet A that it is in the West.

I want to address something specific in the article posted above. It stated that because COBOL is by its nature defective or too old, it cost the US $40 billion in GDP. That sounds like a lot, but in an economy generating tens of trillions of dollars of activity a year, it is a rounding error, a fraction of a percent. Second, re-writing that code has its own costs; it could take even more billions to arrive at exactly the same level of service provided today. There probably isn’t enough money in the world to re-write the existing, mission-critical COBOL code in something else. It would take away from other budgets and, if the result is not maintained, will produce the same problem 10, 20, or 30 years in the future. And where will publications like the FT get cheap headlines if COBOL goes away?