
Artificial Intelligence and Signal Processing

AI as a programmer’s assistant

In 2022, the artificial intelligence research and deployment company OpenAI introduced a conversational large language model called “ChatGPT”, which was reportedly trained on 8 billion pages of text, almost every book ever published, all of Wikipedia, and selected websites. (You can try it at no cost at https://chat.openai.com/.) The current version, as of mid-2024, is ChatGPT-4o. Since that introduction, many competing chatbots have appeared, such as Google’s Gemini, Microsoft’s Copilot, and Anthropic’s Claude. The strength of such chatbots is in language interpretation and writing. For example, they are quite good at simple tasks like suggesting possible titles for papers, talks, or proposals that you are working on; just feed them all or a portion of what you have written. They can also paraphrase and outline, which can be useful for writing condensed summaries or abstracts. Their knowledge base is extraordinarily wide, and they can often answer very detailed questions about technical topics. For example, chatbots are often better than Google at answering specific questions about hardware or software that would otherwise force you to plow through pages and pages of documentation.

But what does ChatGPT know about signal processing? Here are some examples of queries related to signal processing, performed in December of 2022, about two weeks after it became publicly available: (1) reduce the noise in a digitally sampled signal vector; (2) detect the peaks in a digitally sampled signal vector; and (3) decrease the widths of overlapping peaks in a digitally sampled signal. In all these cases, ChatGPT gave very reasonable answers, furnished in well-phrased English. Interestingly, if you ask the same query again, you’ll get a different answer, also reasonable, much as if you were to ask a different person the same question. This is very unlike Google, which simply displays a set of links to relevant websites and gives the same response if you repeat the same search immediately.

But AIs can go well beyond that; they can write code in several languages commonly used by scientists, such as Matlab, Python, Wolfram Mathematica, C/C++, R, Pascal, Fortran, HTML, JavaScript, and Excel macros. There are now several AI services especially oriented toward code development, including GitHub Copilot, Replit, Amazon CodeWhisperer, Codex, and Tabnine.

Based on my limited testing, ChatGPT can generate working code in Matlab or Python for simple applications, if you describe the task adequately. In January 2023, I performed a series of tests in which I asked ChatGPT to write code for several signal processing tasks that I had previously coded in Matlab. I found that ChatGPT’s code works for some simple processing tasks, if the description is sufficiently complete, but for more complex tasks, its code often does not work at all, or it does not do what you expect. It’s somewhat misleading that, even in cases where its code does not work, it is presented in good style, usually with explanatory comments for each line, appropriate indentation, examples of use, and even warnings that the code may fail under certain circumstances (e.g., division by zero).

A simple example where ChatGPT works well is to write a "function that returns the index and the value of the element of vector x that is closest to the scalar val. If more than one element of x is equally close to val, it returns vectors of indices and values". This simple but useful function is performed by my val2ind.m function in Matlab. ChatGPT's code, prompted only by the above description in quotes, is functionally identical to my version but is better in terms of programming style: it breaks up my line 3 into its component parts, it includes explanatory comments for every line, and it even gives an example of use (although to be fair my code included three examples).


function [indices, values] = closest(x, val)
    % Find the differences between the elements of x and the scalar value val
    diffs = abs(x - val);

    % Find the minimum difference
    [minDiff, minIndex] = min(diffs);

    % Find the indices of the elements of x that have the minimum difference
    indices = find(diffs == minDiff);

    % Extract the corresponding values of x at those indices
    values = x(indices);
end

 

Note that I did not specify the name of the function in my query, so ChatGPT chose to call it "closest", which is a perfectly good name. ChatGPT also adds this example of use:

 

ChatGPT> To use this function, you would call it like this:

 

x = [3, 4, 7, 11, 2, 9];

[indices, values] = closest(x, 8);

 

Another example of success is coding Caruana's Algorithm, which is a fast way to estimate the peak parameters of a noisy peak that is locally Gaussian near its peak. I asked ChatGPT to "write a function that accepts two vectors x and y approximating a digitally sampled peak, takes the natural log of y, fits a parabola to the resulting data, then computes the position, FWHM, and height of the Gaussian from the coefficients of the parabolic fit". In this case ChatGPT does all the required algebra and creates code that is functionally identical to my hand-coded version.
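To make the algebra in that query concrete, here is a minimal Python sketch of the same calculation, using only the standard library. It is neither my hand-coded Matlab version nor ChatGPT's output; the function names fit_parabola and caruana are illustrative, and the sketch assumes that all y values are positive and that the data cover only the region near the top of a single peak. For a Gaussian, ln(y) = a + b*x + c*x^2, so the position is -b/(2c), the standard deviation is sqrt(-1/(2c)), and the height is exp(a - b^2/(4c)).

```python
import math

def fit_parabola(x, y):
    """Least-squares fit of y = a + b*x + c*x**2, returning (a, b, c)."""
    # Normal equations built from power sums of x (augmented 3x4 matrix).
    sx = [sum(xi ** k for xi in x) for k in range(5)]
    sy = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    A = [[sx[i + j] for j in range(3)] + [sy[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, 3):
            factor = A[row][col] / A[col][col]
            for k in range(col, 4):
                A[row][k] -= factor * A[col][k]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        known = sum(A[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (A[i][3] - known) / A[i][i]
    return tuple(coef)

def caruana(x, y):
    """Estimate (position, FWHM, height) of a Gaussian peak from a parabola fit to ln(y)."""
    if min(y) <= 0:
        raise ValueError("all y values must be positive to take the logarithm")
    x0 = sum(x) / len(x)  # center x to keep the fit well conditioned
    a, b, c = fit_parabola([xi - x0 for xi in x], [math.log(yi) for yi in y])
    if c >= 0:
        raise ValueError("fitted parabola does not open downward; no peak found")
    position = x0 - b / (2 * c)
    fwhm = 2 * math.sqrt(2 * math.log(2)) * math.sqrt(-1 / (2 * c))
    height = math.exp(a - b ** 2 / (4 * c))
    return position, fwhm, height

# Example: a noiseless Gaussian with position 5.3, FWHM 2, and height 3.
x = [4 + 0.1 * i for i in range(26)]
y = [3.0 * math.exp(-4 * math.log(2) * ((xi - 5.3) / 2.0) ** 2) for xi in x]
print(caruana(x, y))
```

Because the fit is a single linear least-squares step rather than an iteration, it is fast, but taking the logarithm amplifies noise in the low-amplitude points, so it is best restricted to data near the peak maximum.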


I asked both ChatGPT and Microsoft’s Copilot to “Write a Matlab script that creates a plot with two horizontal subplots, the top one showing a signal with several peaks and some random noise, and the bottom one showing the results of applying a smooth of three different widths to that signal, each shown in a different line style and color, with a legend identifying each plot and its smooth width”.

This results in a working script that generates the graphic shown here, just as requested, although Copilot included only one of the smooth widths in the legend. (Note that the chatbots are forced to make choices for several things that I did not specify exactly, including the shape of the peaks, the sampling rate and signal length, and the three smooth widths. Moreover, they add titles to both subplots, even though I did not specify that detail.) It is particularly handy that, if you want a Python equivalent, for example, you can simply say “How can I do that in Python” and the chatbot creates a working Python script that imports the required libraries and generates an almost identical graphic. (The same may be true of the other languages that it knows, but I have not tested that.) You can also feed it a program written in one of its languages and ask it to convert it into another of its languages.


Another query that created well-structured, working code is: "…create a signal consisting of a noisy Gaussian peak, determine its peak height, position, and width by iterative curve fitting, repeat the process 10 times with the same Gaussian but with different independent samples of random noise, and then compute the mean and standard deviation of the peak heights, positions, and widths". Matlab code. Python code.


Clearly, asking a chatbot to perform routine tasks such as these is quick and convenient, especially if you are creating the equivalent code in more than one language; it spits out the code faster than I can type. For larger, more complex projects, you could break up the code into functions that can be created separately and then combined later as needed. This is just what I would have expected from a human programmer’s assistant. ChatGPT always presents its results nicely formatted, with correct spelling and grammar, which many people interpret as a “confident attitude”. This inspires trust in the results, but just as for people, confident does not necessarily mean correct; there are several important caveats:

First, the code ChatGPT generates is not necessarily unique; if you ask it to repeat the task, you’ll often get different code (unless the task is so simple that there is only one possible way to do it correctly). This is not necessarily a flaw; there is often more than one way to code a function for a particular purpose. For one thing, the code may require temporary variables to be defined within the function; the names of those variables will be chosen by ChatGPT and won’t always be the same from trial to trial. Moreover, unless you specify the name of the function, ChatGPT will choose that as well.

 

Second, and more important, there may be more than one way to interpret your request unless you have worded it very carefully to be unambiguous. Take the example of data smoothing. On the face of it, this is a simple process. Suppose we ask for a function that performs a sliding-average smooth of width n and applies it m times. How will ChatGPT interpret that request? If you simply say “applies an n-point sliding average smooth to the y data and repeats it m times”, you will get this code, which does what you asked but probably not what you want. The point of applying repeated smooths is to apply the smooth again to the previously smoothed data, not repeatedly to the original data, as this code does. The general rule is that m passes of an n-point smooth result in a center-weighted smooth of width m*n-m+1. You will get that if you ask for a function that “applies an n-point sliding average smooth to y and then repeats the smooth on the previously smoothed data m times”, a small but critical difference, resulting in this code, whereas the previous code returns a singly-smoothed result no matter the value of m.
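The difference between the two readings can be sketched in a few lines of Python. This is an illustration written for this comparison, not the code that ChatGPT produced; the function names are hypothetical, and the simple sliding average used here just truncates its window at the ends of the data. Applied to a single-point impulse, two passes of a 3-point smooth should spread it over 2*3-2+1 = 5 points, whereas smoothing the original data twice leaves it only 3 points wide.

```python
def sliding_average(y, n):
    """Centered n-point sliding average (n odd); the window is truncated at the ends."""
    half = n // 2
    out = []
    for i in range(len(y)):
        window = y[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def smooth_original_each_time(y, n, m):
    """Literal reading: smooth the ORIGINAL data on every pass, so m has no effect."""
    result = list(y)
    for _ in range(m):
        result = sliding_average(y, n)
    return result

def smooth_repeatedly(y, n, m):
    """Intended reading: each pass smooths the output of the previous pass."""
    result = list(y)
    for _ in range(m):
        result = sliding_average(result, n)
    return result

# A single-point impulse makes the effective smooth width easy to see.
impulse = [0.0] * 5 + [1.0] + [0.0] * 5
literal = smooth_original_each_time(impulse, 3, 2)
intended = smooth_repeatedly(impulse, 3, 2)
print("nonzero points, literal reading: ", sum(1 for v in literal if v > 0))
print("nonzero points, intended reading:", sum(1 for v in intended if v > 0))
```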


 

Third, there may be unspecified details or side effects that require addressing, such as the expectation that the number of data points in the signal after processing should be the same as before processing. In the case of smoothing, for example, there is the question of what to do about the first and last few data points (roughly n/2 at each end for an n-point smooth), for which there are not enough neighboring points to compute a complete smooth. There is also the requirement that the smooth operation should not shift the x-axis positions of signal features, which is especially critical in scientific applications. Human-coded smooth algorithms, such as fastsmooth, consider all these details.
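As a small illustration of the position-shift issue, the Python sketch below compares two plausible ways of coding an "n-point sliding average". It is not fastsmooth, and the function names are made up for this example. Both versions return as many points as they are given, but the trailing average (each point averaged with the n-1 points before it) moves a peak to the right by about (n-1)/2 points, while the centered version, which shrinks its window symmetrically near the ends, leaves the peak where it was.

```python
def trailing_average(y, n):
    """Average of each point and the n-1 points before it; shifts features to the right."""
    out = []
    for i in range(len(y)):
        window = y[max(0, i - n + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

def centered_average(y, n):
    """Average over a window centered on each point (n odd); no shift in peak position."""
    half = n // 2
    out = []
    for i in range(len(y)):
        reach = min(half, i, len(y) - 1 - i)  # shrink symmetrically near the ends
        window = y[i - reach: i + reach + 1]
        out.append(sum(window) / len(window))
    return out

def peak_index(y):
    """Index of the largest element of y."""
    return max(range(len(y)), key=lambda i: y[i])

# Triangular test peak with its maximum at index 10.
signal = [max(0.0, 5.0 - abs(i - 10)) for i in range(21)]
print("original peak index:", peak_index(signal))
print("trailing average:   ", peak_index(trailing_average(signal, 5)))
print("centered average:   ", peak_index(centered_average(signal, 5)))
```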

Another example of unspecified details is the measurement of the full width at half-maximum (FWHM) of smooth peaks of any shape. The function I wrote for that task is “halfwidth.m”. I used the description for that function as the ChatGPT query: “…a Matlab function that accepts a time series x,y, and computes the full width at half maximum of the peak that is nearest xo. If xo is omitted, it computes the halfwidth from the maximum y value”. The AI came up with “computeFWHM.m”, which works if the sampling rate is high enough. However, the AI’s version fails to interpolate between data points when the half-intensity points fall between the data points. This is demonstrated by the script “CompareFWHM.m”, which compares both functions on some synthetic data with adjustable sampling rate.
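The interpolation step can be sketched as follows in Python. This is a simplified illustration, not a translation of halfwidth.m or computeFWHM.m: the function name is hypothetical, it handles only the tallest peak (there is no xo argument), and it assumes a positive peak that falls below half of its maximum on both sides within the data. The half-maximum crossing on each side is located by linear interpolation between the two samples that bracket it, so the result does not jump from one sample to the next when the sampling is coarse.

```python
def fwhm_interpolated(x, y):
    """FWHM of the tallest peak, with linear interpolation at the half-maximum crossings."""
    i_max = max(range(len(y)), key=lambda i: y[i])
    if y[i_max] <= 0:
        raise ValueError("peak maximum must be positive")
    half = y[i_max] / 2.0

    # Walk left from the maximum to the first sample at or below half maximum.
    i = i_max
    while i > 0 and y[i] > half:
        i -= 1
    # Walk right from the maximum in the same way.
    j = i_max
    while j < len(y) - 1 and y[j] > half:
        j += 1
    if y[i] > half or y[j] > half:
        raise ValueError("peak does not fall to half maximum within the data")

    # Interpolate between the bracketing samples on each side.
    left = x[i] + (half - y[i]) * (x[i + 1] - x[i]) / (y[i + 1] - y[i])
    right = x[j - 1] + (half - y[j - 1]) * (x[j] - x[j - 1]) / (y[j] - y[j - 1])
    return right - left

# Coarsely sampled triangular peak: maximum at x = 5, true FWHM of 2 by construction,
# with the half-maximum points (x = 4 and x = 6) falling between samples.
x = [0.2 + 0.8 * k for k in range(13)]
y = [max(0.0, 1.0 - abs(xi - 5.0) / 2.0) for xi in x]
print(fwhm_interpolated(x, y))
```

Linear interpolation is exact for this triangular test peak; for curved peak shapes such as a Gaussian it is only an approximation, though it improves as the sampling rate increases.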

 

Another technical kink relates to the common Matlab practice of saving functions that you have written as separate files on disk and then later calling those saved functions from a script that you are writing, relying on Matlab to find them in the path. If you ask ChatGPT to convert that new script to another language, you must first embed your external functions into the script itself (which Matlab allows in R2016b or later).


A more challenging problem is iterative fitting of noisy overlapping peaks. The query was to “fit the sum of n Gaussian functions to the data given in x and y, given initial estimates of the positions and widths of each Gaussian. The function returns the best-fit position, width, and height of all the Gaussians”. ChatGPT came up with an attractively coded “iterativefitGaussians.m”. The closest hand-coded equivalent in my tool chest was “fitshape2.m”; both codes are about the same length and require similar input and output arguments. There is seldom a uniquely correct answer to iterative fitting problems, but the difference in performance between these two codes was instructive. The self-contained script “DemoiterativefitGaussians2.m” compares the two functions for a simulated signal with three noisy peaks whose true parameters are set in lines 13-15. For an “easy” test signal, with little peak overlap, both work well. But if the peaks overlap, ChatGPT’s function fails. The difference is probably due to the minimization function employed (lsqcurvefit.m vs fminsearch.m).


More complete iterative peak fitters, such as my peakfit.m or its interactive equivalent ipf.m, are far larger and more complex and have several thousand lines of code. They were developed incrementally over time and applied to many different kinds of signals, with lots of suggestions and corrections by users. Even describing such programs completely for a chatbot would be tedious at best. It’s unrealistic to expect any chatbot to completely replace such efforts, but they can be useful for writing individual modular functions and subsets of code.


ChatGPT cannot write Matlab Live Scripts or Apps, because they are not text files. However, you can easily convert a regular script into a Live Script as described here.



The fact is that more thought and experience goes into hand-coded programs than into AI-generated ones. An experienced human coder knows the typical range of applications and anticipates typical problems and limitations, especially as those apply to the specific field of application, such as signals generated by different types of scientific techniques. An AI, on the other hand, has a wider (and presumably shallower) range of knowledge. Of course, AIs know a great deal about specific computer languages and their capabilities and built-in functions, which can be very useful, but they are no replacement for human experience. It goes without saying that you must test the code that ChatGPT gives you, as you must test your own code.


Another quite useful service that chatbots can perform is to recognize the function and purpose of parts of a program that you are trying to understand. For example, if you feed ChatGPT the short script sft.m without any comments, it will recognize it as a Fourier transform calculation. It will also recognize mathematical expressions and equations, such as V = S*(1 + R)^T.


It seems likely that in the future, AI services such as ChatGPT will be much more capable and more widely available, probably on a subscription basis. Development is ongoing, and millions of users are already using the ChatGPT server with a free account or the subscription-based ChatGPT 4o.

Further reading:

How to use ChatGPT to code any programming language


Using ChatGPT as Your Programming Assistant

How to use ChatGPT to write code: What it can and can't do for you




This page is part of "A Pragmatic Introduction to Signal Processing", created and maintained by Prof. Tom O'Haver, Department of Chemistry and Biochemistry, The University of Maryland at College Park. Comments, suggestions and questions should be directed to Prof. O'Haver at toh@umd.edu. Updated May, 2024.