The Future of Coding: Will Generative AI Make Programmers Obsolete?

Table of Contents

  1. Is coding still worth learning in 2024?
  2. Is AI replacing software engineers?
  3. Impact of AI on software engineering
  4. The problem with AI-generated code
  5. How AI can help software engineers
  6. Does AI really make you code faster?
  7. Can one AI-powered engineer do the work of many?
  8. Future of Software Engineering
  9. Reference
Credits: this post is a set of notes on the key points from a video by the YouTube content creator Programming with Mosh, with some editorial work. TL;DR: watch the video.

Is coding still worth learning in 2024?

This is a common question, especially among younger students who are trying to choose a career path that offers some security of future income.

People are worried that AI is going to replace software engineers, or any engineer whose work involves coding and design.

As you know, in the digital era we should trust solid data instead of media and hearsay. Social media has been creating an anxious feeling that every job is going to collapse because of AI and that coding has no future.

But I’ve got a different take backed up by real-world numbers as follows.

Note: In this post, “software engineer” represents all groups of coders (data engineer, data analyst, data scientist, machine learning engineer, frontend/backend/full-stack developers, programmers and researchers).

Is AI replacing software engineers?

The short answer is NO.

But there is a lot of fear about AI replacing coders. Headlines scream about robots taking over jobs, and it can be overwhelming. But the truth is:

AI is not going to take your job; instead, the people who can work with AI will have the advantage, and they will probably take your job.

Software engineering is not going away, at least not anytime soon in our generation. Here is some data to back this up.

The U.S. Bureau of Labor Statistics (BLS) is a government agency that tracks job growth across the country and publishes the data on its website. The data shows continued demand for software developers and for computer and information research scientists.

The BLS projects that employment of software developers will grow by 26% from 2022 to 2032, while the average across all occupations is only 3%. This is a strong indication that software engineering is here to stay.

Source: https://www.bls.gov/ooh/computer-and-information-technology/software-developers.htm#tab-6

In our lives, the research and development conducted by computer and information research scientists turn ideas into technology. As demand for new and better technology grows, demand for computer and information research scientists will grow as well.

There is a similar trend for computer and information research scientists, whose employment is expected to grow by 23% from 2022 to 2032.

source: https://www.bls.gov/ooh/computer-and-information-technology/computer-and-information-research-scientists.htm#tab-6

Impact of AI on software engineering

To better understand the impact of AI on software engineering, let’s do a quick revisit of the history of programming.

In the early days of programming, engineers wrote code in a form that only the computer understood. Then we created compilers, so we can now program in a human-readable language like C++ or Java without worrying about how the code eventually gets converted into zeros and ones, or where it gets stored in memory.

Here is the fact:

Compilers did not replace programmers. They made them more efficient!

Since then we have built so many software applications and totally changed the world.

The problem with AI-generated code

AI will likely change the future in the same way: we will be able to delegate routine and repetitive coding tasks to AI, so we can focus on complex problem-solving, design and innovation.

This will allow us to build more sophisticated software applications than most people can even imagine today. But even then, just because AI can generate code doesn’t mean we can or should delegate the entire coding aspect of software development to AI, because:

AI-generated code is lower quality; we still need to review and refine it before using it in production.

In fact, there is a study to support this: Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality. The authors analyzed 153M changed lines of code written from 2020 to 2023 and found disconcerting trends for maintainability: code churn is projected to double in 2024 compared with its 2021, pre-AI baseline.

source: Abstract of the 2023 Data Shows Downward Pressure on Code Quality

So, yes, we can produce more code with AI, but:

More Code != Better Code

Humans should always review and refine AI-generated code for quality and security before deploying it to production. That means all the coding skills that software engineers currently have will stay relevant in the future.

You still need to know data structures and algorithms, programming languages and their tricky parts, and tools and frameworks. You need all of that knowledge to review and refine AI-generated code; you will just spend less time typing it into the computer.

So anyone telling you that you can use natural language to build software without understanding anything about coding is out of touch with the reality of software engineering (or is trying to sell you something, such as GPUs).

source: NVIDIA CEO: No Need To Learn Coding, Anybody Can Be A Programmer With Technology

How AI can help software engineers

Of course, you can make a dummy app with AI in minutes, but this is not the same kind of software that runs our banks, transportation, healthcare, security and more. These are the systems that really matter, and our lives depend on them. We can’t let a code monkey talk to a chatbot in English and get that software built. At least, this will not happen in our lifetime.

In the future, we will probably spend more time designing new features and products with AI instead of writing boilerplate code. We will likely delegate aspects of coding to AI, but this doesn’t mean we don’t need to learn to code.

As a software engineer or any coding practitioner, you will always need to review what AI generates and refine it either by hand or by guiding the AI to improve the code.

Keep in mind that coding is only one small part of a software engineer’s job. We often spend most of our time talking to people, understanding requirements, writing stories, discussing software and system architecture, and so on.

Instead of being worried about AI, I’m more concerned about Human Intelligence!

Does AI really make you code faster?

AI can boost our programming productivity, but not necessarily our overall productivity.
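To see why a boost in coding speed does not carry over one-to-one to the whole job, here is a back-of-the-envelope sketch in Python. The inputs are assumptions for illustration only (they are not from the video or from any report): suppose coding takes 30% of an engineer's time and AI makes that part twice as fast.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Overall speedup when only the coding share of the work gets faster.

    coding_share: fraction of total time spent coding (assumed, between 0 and 1).
    coding_speedup: how many times faster coding becomes with AI (assumed).
    """
    if not 0.0 <= coding_share <= 1.0 or coding_speedup <= 0:
        raise ValueError("coding_share must be in [0, 1] and coding_speedup > 0")
    # Non-coding work takes as long as before; only the coding share shrinks.
    new_total_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / new_total_time


# Hypothetical inputs: 30% of time spent coding, coding made 2x faster.
print(f"{overall_speedup(0.30, 2.0):.2f}x overall")
```

With these made-up inputs, doubling the coding speed makes the whole job only about 1.18 times faster, because the meetings, requirements and design work take just as long as before.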

In fact, McKinsey’s report, Unleashing Developer Productivity with Generative AI, found that for highly complex tasks developers saw less than a 10% improvement in their speed with generative AI support.

source: https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/unleashing-developer-productivity-with-generative-ai

According to the report, AI helped the most with documentation and, to some extent, with code generation. For code refactoring the improvement dropped to 20%, and for high-complexity tasks it was less than 10%.

 Time savings shrank to less than 10 percent on tasks that developers deemed high in complexity due to, for example, their lack of familiarity with a necessary programming framework.

Thus, if anyone tells you that software engineers will be obsolete in 5 years, they are either ignorant or trying to sell you something.

In fact, some studies suggest that the role of software engineers (coders) may become more valuable, as they will be needed to develop, manage and maintain these AI systems.

They (software engineers) need to understand all the complexity of building software and use AI to boost their productivity.

Can one AI-powered engineer do the work of many?

People are also worried that one senior engineer could simply use AI to replace many engineers, eventually leaving no job opportunities for juniors.

But again, this is a fallacy, because in reality the time savings you get from AI are not as great as promised. Anyone who uses AI to generate code knows that. It takes effort to get the right prompts for usable results, and the code still needs polishing.

Thus, it is not like one engineer will suddenly have so much free time to do the job of many people.

But you may ask: that is now, so what about the future? Maybe in a year or two, AI will start to build software like a human.

In theory, yes: AI is advancing, and one day it may even reach and surpass human intelligence. But as a saying often attributed to Einstein goes:

In Theory, Theory and Practice are the Same.

In Practice, they are NOT.

The reality is that while machines may be able to handle repetitive and routine tasks, human creativity and expertise will still be necessary for developing complex solutions and strategies.

Software engineering will be extremely important over the next several decades. I don’t think it is going away in the future, but I do believe it will change.

Future of Software Engineering

Software powers our world and that will not change anytime soon.

In the future, we will have to learn how to give our AI tools the right prompts to get the expected results. This is not an easy skill to develop; it requires problem-solving ability as well as knowledge of programming languages and tools. So, if you’ve already made up your mind and don’t want to invest your time in software engineering or coding, that’s perfectly fine. Follow your passion!

The coding tools will evolve as they always do, but the true coding skill lies in learning and adapting. The future engineer needs today’s coding skills plus a good understanding of how to use AI effectively. The future brings more complexity and demands more knowledge and adaptability from software engineers.

If you like building things with code, and if the idea of shaping the future with technology gets you excited, don’t let negativity and fear of generative AI hold you back.

Reference

Prompt Engineering for LLM

2024-Feb-04: 1st Version

  1. Introduction
  2. Basic Prompting
    1. Zero-shot
    2. Few-shot
    3. Hallucination
  3. Perfect Prompt Formula for ChatBots
  4. RAG, CoT, ReAct, SASE, DSP …
    1. RAG: Retrieval-Augmented Generation
    2. CoT: Chain-of-Thought
    3. Self-Ask + Search Engine
    4. ReAct: Reasoning and Acting
    5. DSP: Directional Stimulus Prompting
  5. Summary and Conclusion
  6. Reference
Prompt engineering is like adjusting audio without opening the equipment.

Introduction

Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with a Large Language Model (LLM) like GPT (Generative Pre-trained Transformer) to manipulate/steer its behaviour for expected outcomes without updating, retraining or fine-tuning the model weights. 

Researchers, developers, or users may engage in prompt engineering to instruct a model for specific tasks, improve the model’s performance, or adapt it to better understand and respond to particular inputs. It is an empirical science and the effect of prompt engineering methods can vary a lot among models, thus requiring heavy experimentation and heuristics.

This post focuses only on prompt engineering for autoregressive language models, so it does not cover image generation or multimodal models.

Basic Prompting

Zero-shot and few-shot prompting are the two most basic approaches for prompting a model. They were pioneered by many LLM papers and are commonly used for benchmarking LLM performance. In other words, zero-shot and few-shot tests evaluate how well a large language model (LLM) handles a task when the prompt contains no examples or only a few. Here are examples of both:

Zero-shot

Zero-shot learning simply feeds the task text to the model and asks for results.

Scenario: Text Completion (Please try the following input in ChatGPT or Google Bard)

Input:

Task: Complete the following sentence:

Input: The capital of France is ____________.

Output (ChatGPT / Bard):

Output: The capital of France is Paris.
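If you want to send the same kind of prompt from code instead of typing it into a chat window, a minimal sketch could look like this. The helper name build_zero_shot_prompt is my own invention for illustration, and the actual call to a model is left out because it depends on the API you use.

```python
def build_zero_shot_prompt(task: str, query: str) -> str:
    """Build a zero-shot prompt: the task and the input, with no examples."""
    return f"Task: {task}\n\nInput: {query}\n\nOutput:"


prompt = build_zero_shot_prompt(
    task="Complete the following sentence:",
    query="The capital of France is ____________.",
)
print(prompt)  # this string is what you would send to the model
```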

Few-shot

Few-shot learning presents a set of high-quality demonstrations, each consisting of both input and desired output, on the target task. As the model first sees good examples, it can better understand human intention and criteria for what kinds of answers are wanted. Therefore, few-shot learning often leads to better performance than zero-shot. However, it comes at the cost of more token consumption and may hit the context length limit when the input and output text are long.

Scenario: Text Classification

Input:

Task: Classify movie reviews as positive or negative.

Examples:
Review 1: This movie was amazing! The acting was superb.
Sentiment: Positive
Review 2: I couldn't stand this film. The plot was confusing.
Sentiment: Negative

Question:
Review: I'll bet the video game is a lot more fun than the film.
Sentiment:____

Output

Sentiment: Negative
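The same prompt can be assembled in code. This is only a sketch: build_few_shot_prompt is a hypothetical helper, not part of any library, and it simply lays out the demonstrations before the open question so the model can complete the last line.

```python
from typing import List, Tuple


def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Build a few-shot prompt: task, labelled demonstrations, then the open question."""
    lines = [f"Task: {task}", "", "Examples:"]
    for i, (review, sentiment) in enumerate(examples, start=1):
        lines.append(f"Review {i}: {review}")
        lines.append(f"Sentiment: {sentiment}")
    # The final "Sentiment:" is left blank for the model to fill in.
    lines += ["", "Question:", f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


examples = [
    ("This movie was amazing! The acting was superb.", "Positive"),
    ("I couldn't stand this film. The plot was confusing.", "Negative"),
]
prompt = build_few_shot_prompt(
    "Classify movie reviews as positive or negative.",
    examples,
    "I'll bet the video game is a lot more fun than the film.",
)
print(prompt)
```

Every extra demonstration adds tokens to the prompt, which is the cost mentioned above.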

Many studies have explored the construction of in-context examples to maximize performance. They observed that the choice of prompt format, training examples, and the order of the examples can significantly impact performance, ranging from near-random guesses to near-state-of-the-art performance.

Hallucination

In the context of Large Language Models (LLMs), hallucination refers to a situation where the model generates outputs that are incorrect or not grounded in reality. A hallucination occurs when the model produces information that seems plausible or coherent but is actually not accurate or supported by the input data.

For example, in a language generation task, if a model is asked to provide information about a topic and it generates details that are factually incorrect or have no basis in the training data, that is considered a hallucination. This phenomenon is a concern in natural language processing because it can lead to the generation of misleading or false information.

Addressing hallucination in LLMs is a challenging task, and researchers are actively working on developing methods to improve the models’ accuracy and reliability. Techniques such as fine-tuning, prompt engineering, and designing more specific evaluation metrics are among the approaches used to mitigate hallucination in language models.

Perfect Prompt Formula for ChatBots

For everyday personal writing tasks such as text generation, six key components make up the perfect formula for ChatGPT and Google Bard:

Task, Context, Exemplars, Persona, Format, and Tone.

Prompt Formula for ChatBots
  1. The Task sentence needs to articulate the end goal and start with an action verb.
  2. Use three guiding questions to help structure relevant and sufficient Context.
  3. Exemplars can drastically improve the quality of the output by giving specific examples for the AI to reference.
  4. For Persona, think of who you would ideally want the AI to be in the given task situation.
  5. Visualizing your desired end result will let you know what format to use in your prompt.
  6. And you can actually use ChatGPT to generate a list of Tone keywords for you to use!
Example from Jeff Su: Master the Perfect ChatGPT Prompt Formula 
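As a rough sketch, the six components can be kept as separate fields and joined into one prompt. Everything here is my own illustration: the PromptParts class, the field order and the choice to make only Task mandatory are assumptions, not something prescribed by the video.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The six components of the formula; only the task is required in this sketch."""
    task: str
    context: str = ""
    exemplars: str = ""
    persona: str = ""
    format: str = ""
    tone: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty components into one prompt, one labelled line each."""
    if not parts.task.strip():
        raise ValueError("The task component is required.")
    ordered = [
        ("Persona", parts.persona),
        ("Task", parts.task),
        ("Context", parts.context),
        ("Exemplars", parts.exemplars),
        ("Format", parts.format),
        ("Tone", parts.tone),
    ]
    return "\n".join(f"{label}: {text}" for label, text in ordered if text.strip())


# Hypothetical values, just to show the shape of the result.
print(build_prompt(PromptParts(
    task="Write a three-day meal plan.",
    context="I am vegetarian and cook for two people.",
    persona="You are an experienced nutritionist.",
    format="Use a table with one row per day.",
    tone="Friendly",
)))
```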

RAG, CoT, ReAct, SASE, DSP …

If you have ever wondered what on earth the techies mean by the terms above, please read on …

OK, so here’s the deal. We’re diving into the world of academia, talking about machine learning and large language models in the computer science and engineering domains. I’ll try to explain it in a simple way, but you can always dig deeper into these topics elsewhere.

RAG: Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) typically refers to a system that combines retrieval and generation: it uses a retrieval mechanism to fetch relevant information from a database or knowledge base and then generates a response based on that retrieved information. In real applications, the user’s input and the model’s output are also pre- and post-processed to follow certain rules and comply with laws and regulations.

RAG: Retrieval-Augmented Generation

Here is a simplified example of using a Retrieval-Augmented Generation (RAG) model for a question-answering task. In this example, we’ll use a system that retrieves relevant passages from a knowledge base and generates an answer based on that retrieved information.

Input:

User Query: What are the symptoms of COVID-19?

Knowledge Base:

1. Title: Symptoms of COVID-19
Content: COVID-19 symptoms include fever, cough, shortness of breath, fatigue, body aches, loss of taste or smell, sore throat, etc.

2. Title: Prevention measures for COVID-19
Content: To prevent the spread of COVID-19, it's important to wash hands regularly, wear masks, practice social distancing, and get vaccinated.

3. Title: COVID-19 Treatment
Content: COVID-19 treatment involves rest, hydration, and in severe cases, hospitalization may be required.

RAG Model Output:

Generated Answer: 

The symptoms of COVID-19 include fever, cough, shortness of breath, fatigue, body aches, etc.
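The retrieve-then-generate flow in this example can be sketched in a few lines of Python. Please read it as a toy under stated assumptions: real systems usually retrieve with embeddings or a search index, whereas this sketch ranks passages by simple word overlap, and the generation step is left as a prompt that you would pass to whatever model you use. The function names are made up for illustration.

```python
import re
from typing import Dict, List, Set

KNOWLEDGE_BASE: List[Dict[str, str]] = [
    {"title": "Symptoms of COVID-19",
     "content": "COVID-19 symptoms include fever, cough, shortness of breath, fatigue, "
                "body aches, loss of taste or smell, sore throat, etc."},
    {"title": "Prevention measures for COVID-19",
     "content": "To prevent the spread of COVID-19, it's important to wash hands regularly, "
                "wear masks, practice social distancing, and get vaccinated."},
    {"title": "COVID-19 Treatment",
     "content": "COVID-19 treatment involves rest, hydration, and in severe cases, "
                "hospitalization may be required."},
]

# A tiny stopword list (an assumption of this toy) so common words do not dominate the score.
STOPWORDS: Set[str] = {"what", "are", "the", "of", "a", "an", "is", "to", "and", "in", "for"}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its distinct words, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(query: str, knowledge_base: List[Dict[str, str]], top_k: int = 1) -> List[Dict[str, str]]:
    """Toy retriever: rank passages by how many query words they share."""
    query_words = tokenize(query)
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(query_words & tokenize(doc["title"] + " " + doc["content"])),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(query: str, passages: List[Dict[str, str]]) -> str:
    """Put the retrieved passages in front of the question for the generator."""
    context = "\n".join(f"- {p['title']}: {p['content']}" for p in passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


query = "What are the symptoms of COVID-19?"
print(build_rag_prompt(query, retrieve(query, KNOWLEDGE_BASE)))
```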

Remark: ChatGPT 3.5 gives a basic result like the one above, whereas Google Bard adds extra resources such as CDC links and other sources it gets from search engines. My guess is that Google uses a different framework from OpenAI.

CoT: Chain-of-Thought

Chain-of-thought (CoT) prompting (Wei et al. 2022) generates a sequence of short sentences that describe the reasoning step by step, known as reasoning chains or rationales, which eventually lead to the final answer.

The benefit of CoT is more pronounced for complicated reasoning tasks while using large models (e.g. with more than 50B parameters). Simple tasks only benefit slightly from CoT prompting.
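Here is a minimal sketch of what a few-shot CoT prompt can look like in code. The demonstration question and its rationale are made up by me for illustration, and the "The answer is ..." convention for pulling out the final answer is an assumption of this sketch rather than a fixed rule.

```python
import re
from typing import List, Optional, Tuple


def build_cot_prompt(demos: List[Tuple[str, str]], question: str) -> str:
    """Few-shot CoT: each demonstration shows the reasoning chain, not just the answer."""
    blocks = [f"Q: {q}\nA: {rationale}" for q, rationale in demos]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


def extract_final_answer(completion: str) -> Optional[str]:
    """Pull the final answer out of a completion that ends with 'The answer is X.'"""
    match = re.search(r"The answer is\s+(.+?)\.?\s*$", completion.strip())
    return match.group(1) if match else None


# A made-up demonstration with its rationale written out step by step.
demos = [(
    "A shelf holds 3 boxes with 4 books each. 5 books are removed. How many books remain?",
    "3 boxes of 4 books is 12 books. 12 - 5 = 7. The answer is 7.",
)]
prompt = build_cot_prompt(demos, "There are 2 bags with 6 apples each and 3 loose apples. How many apples are there?")
print(prompt)

# A completion in the same style, written by hand here to show the parsing step.
print(extract_final_answer("2 bags of 6 apples is 12 apples. 12 + 3 = 15. The answer is 15."))
```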

Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, essentially creating a tree structure. The search process can be BFS or DFS while each state is evaluated by a classifier (via a prompt) or majority vote.

CoT : Chain-of-Thought and ToT: Tree-of-Thought
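The BFS variant of Tree of Thoughts can be sketched as a beam-style search. This is a toy under stated assumptions, not the authors' implementation: in a real system the propose step would ask the LLM for several candidate thoughts and the score step would be an LLM-based evaluation (or a vote), whereas here both are replaced with trivial numeric stand-ins so the search itself is easy to follow.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # thoughts chosen so far (text in a real system, numbers in this toy)


def tree_of_thoughts_bfs(
    propose: Callable[[State], List[int]],
    score: Callable[[State], float],
    depth: int,
    beam_width: int,
) -> State:
    """BFS over thought steps, keeping the `beam_width` best partial chains per level."""
    frontier: List[State] = [()]
    for _ in range(depth):
        # Expand every kept state with every proposed next thought.
        candidates = [state + (thought,) for state in frontier for thought in propose(state)]
        if not candidates:
            break
        # Evaluate each candidate state and prune to the most promising ones.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)


TARGET = 7  # toy goal: pick three numbers from 1..3 that add up to 7


def toy_propose(state: State) -> List[int]:
    """Stand-in for asking the LLM for several possible next thoughts."""
    return [1, 2, 3]


def toy_score(state: State) -> float:
    """Stand-in for an LLM-based evaluator: closer to the target is better."""
    return -abs(TARGET - sum(state))


best = tree_of_thoughts_bfs(toy_propose, toy_score, depth=3, beam_width=2)
print(best, sum(best))
```

Plain CoT corresponds to following a single chain; the tree version keeps several partial chains alive at each step and drops the weak ones.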

Self-Ask + Search Engine

Self-Ask (Press et al. 2022) is a method to repeatedly prompt the model to ask follow-up questions to construct the thought process iteratively. Follow-up questions can be answered by search engine results.

Self-Ask+Search Engine Example
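The loop described above can be sketched as follows. The model and the search engine are both replaced with scripted stand-ins (scripted_llm and scripted_search are hypothetical), so this only shows the control flow: the model asks a follow-up, the search result is appended as an intermediate answer, and the loop stops when the model states a final answer. The sample question is my own, not the one from the paper.

```python
from typing import Callable

FOLLOW_UP = "Follow up:"
FINAL = "So the final answer is:"


def self_ask(question: str, llm: Callable[[str], str], search: Callable[[str], str],
             max_steps: int = 5) -> str:
    """Repeatedly let the model ask follow-up questions; answer them with a search call."""
    transcript = f"Question: {question}\nAre follow up questions needed here: Yes.\n"
    for _ in range(max_steps):
        step = llm(transcript).strip()
        transcript += step + "\n"
        if step.startswith(FINAL):
            return step[len(FINAL):].strip()
        if step.startswith(FOLLOW_UP):
            # The search result is fed back into the context as an intermediate answer.
            answer = search(step[len(FOLLOW_UP):].strip())
            transcript += f"Intermediate answer: {answer}\n"
    raise RuntimeError("No final answer within max_steps")


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: replies from a fixed script based on progress so far."""
    script = [
        "Follow up: Who wrote Hamlet?",
        "Follow up: Where was William Shakespeare born?",
        "So the final answer is: England",
    ]
    return script[transcript.count("Intermediate answer:")]


def scripted_search(query: str) -> str:
    """Stand-in for a search engine: a tiny lookup table."""
    results = {
        "Who wrote Hamlet?": "William Shakespeare",
        "Where was William Shakespeare born?": "Stratford-upon-Avon, England",
    }
    return results.get(query, "No result found.")


print(self_ask("In which country was the author of Hamlet born?", scripted_llm, scripted_search))
```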

ReAct: Reasoning and Acting

ReAct (Reason + Act; Yao et al. 2023) combines iterative CoT prompting with queries to Wikipedia APIs to search for relevant entities and content and then add it back into the context.

Each trajectory consists of multiple thought-action-observation steps (i.e. dense thought), where free-form thoughts are used for various purposes.

Example of ReAct from p. 18 of the paper (Reason + Act; Yao et al. 2023)
ReAct: Reasoning and Acting

Specifically, from the paper, the authors use a combination of thoughts that decompose questions (“I need to search x, find y, then find z”), extract information from Wikipedia observations (“x was started in 1844”, “The paragraph does not tell x”), perform commonsense (“x is not y, so z must instead be…”) or arithmetic reasoning (“1844 < 1989”), guide search reformulation (“maybe I can search/lookup x instead”), and synthesize the final answer (“…so the answer is x”).
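A thought-action-observation loop of this kind can be sketched as below. This is a simplified illustration, not the paper's code: the model is a scripted stand-in, the "Wikipedia" is a two-entry toy dictionary with made-up magazines, and the Action: Name[argument] text format is an assumption I chose to make parsing easy.

```python
import re
from typing import Callable, Dict

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]\s*$", re.MULTILINE)


def react(question: str, llm: Callable[[str], str],
          tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Alternate model steps (thought + action) with tool observations."""
    context = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(context).strip()  # expected shape: "Thought: ...\nAction: Name[argument]"
        context += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            raise ValueError(f"No action found in step: {step!r}")
        name, argument = match.group(1), match.group(2)
        if name == "Finish":
            return argument
        tool = tools.get(name)
        observation = tool(argument) if tool else f"Unknown action: {name}"
        # The observation is added back into the context for the next thought.
        context += f"Observation: {observation}\n"
    raise RuntimeError("No Finish action within max_steps")


# Toy knowledge source with hypothetical entries (stand-in for a Wikipedia API).
TOY_WIKI = {
    "Magazine A": "Magazine A was started in 1844.",
    "Magazine B": "Magazine B was started in 1989.",
}


def scripted_llm(context: str) -> str:
    """Stand-in for a real model: replies from a fixed script based on progress so far."""
    script = [
        "Thought: I need to search Magazine A and Magazine B, then compare the years.\n"
        "Action: Search[Magazine A]",
        "Thought: Magazine A was started in 1844. I need to search Magazine B next.\n"
        "Action: Search[Magazine B]",
        "Thought: 1844 < 1989, so Magazine A was started first.\n"
        "Action: Finish[Magazine A]",
    ]
    return script[context.count("Observation:")]


tools = {"Search": lambda entity: TOY_WIKI.get(entity, "No result found.")}
print(react("Which magazine was started first, Magazine A or Magazine B?", scripted_llm, tools))
```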

DSP: Directional Stimulus Prompting

Directional Stimulus Prompting (DSP; Z. Li et al. 2023) is a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs. Instead of directly adjusting the LLM, this method employs a small tunable policy model to generate an auxiliary directional stimulus prompt (a hint) for each input instance.

DSP: Directional Stimulus Prompting
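To show where the hint fits in, here is a rough sketch. The big assumption is the policy model: in the framework it is a small tunable model, but here a keyword-frequency heuristic (toy_policy_model, my own stand-in) plays that role so the example runs without any model. Only the prompt changes per input; the black-box LLM itself is never touched.

```python
import re
from collections import Counter
from typing import List

STOPWORDS = {"that", "this", "with", "from", "have", "were", "they", "their", "will"}


def toy_policy_model(text: str, k: int = 3) -> List[str]:
    """Stand-in for the small tunable policy model: pick the k most frequent content words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if len(w) > 3 and w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(k)]


def build_dsp_prompt(article: str, hints: List[str]) -> str:
    """Attach the per-input hint to the prompt sent to the black-box LLM."""
    return (
        f"Article: {article}\n"
        f"Hint: {'; '.join(hints)}\n"
        "Summarize the article in one sentence, using the hint keywords."
    )


article = "Compilers made programmers faster. Compilers did not replace programmers."
hints = toy_policy_model(article, k=2)
print(build_dsp_prompt(article, hints))
```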

Summary and Conclusion

Prompt engineering involves carefully crafting prompts to achieve desired results. It can include experimenting with different phrasings, structures, and strategies to elicit the desired information or responses from the model. This process is crucial because the performance of language models can be sensitive to how prompts are formulated.

I believe a lot of researchers will agree with me. Some prompt engineering papers don’t need to be 8 pages long. They could explain the important points in just a few lines and use the rest for benchmarking. 

As researchers and developers delve further into the realms of prompt engineering, they continue to push the boundaries of what these sophisticated models can achieve.

To achieve this, it’s important to create a user-friendly LLM benchmarking system that many people will use. Developing better methods for creating prompts will help advance language models and improve how we use LLMs. These efforts will have a big impact on natural language processing and related fields.

Reference

  1. Weng, Lilian (Mar 2023). Prompt Engineering. Lil’Log.
  2. IBM (Jan 2024). 4 Methods of Prompt Engineering.
  3. Jeff Su (Aug 2023). Master the Perfect ChatGPT Prompt Formula.