Wed. Feb 26th, 2020

GNU Awk 5.0 released: GNU awk programming language


GNU Awk (Gawk) is a programming language for processing text and data under Linux/Unix. Data can come from standard input, files, or the output of other commands. It supports features such as user-defined functions and dynamic regular expressions.

The basic function of awk is to search files for lines (or other units of text) that contain certain patterns. When a line matches one of the patterns, awk performs specified actions on that line. awk continues to process input lines in this way until it reaches the end of the input files.
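
The search-and-act behavior described above can be sketched with a single rule. The three log lines and the /error/ pattern are hypothetical, chosen only for illustration; the command uses plain awk and nothing specific to Gawk 5.0.

```shell
# Hypothetical sample input: three log lines, one record per line.
input='info: service started
error: disk full
info: service stopped'

# The pattern /error/ selects the matching lines; the action prints each one.
matches=$(printf '%s\n' "$input" | awk '/error/ { print }')
echo "$matches"
```

Lines that do not match the pattern are read and skipped, so only the matching record reaches the output.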

Programs in awk are different from programs in most other languages because awk programs are data-driven (i.e., you describe the data you want to work with and then what to do when you find it). Most other languages are procedural; you have to describe, in great detail, every step the program should take. When working with procedural languages, it is usually much harder to clearly describe the data your program will process. For this reason, awk programs are often refreshingly easy to read and write.

When you run awk, you specify an awk program that tells awk what to do. The program consists of a series of rules. Each rule specifies one pattern to search for and one action to perform upon finding the pattern.
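
As a rough illustration of a program built from several rules, the sketch below pairs two patterns with two actions. The names, scores, and the pass mark of 50 are made up for the example.

```shell
# Hypothetical sample data: a name and a score on each line.
scores='alice 82
bob 47
carol 91'

# Two rules, each a pattern followed by an action in braces.
# The first fires for scores of 50 or more, the second for scores below 50.
report=$(printf '%s\n' "$scores" | awk '
    $2 >= 50 { print $1, "pass" }
    $2 <  50 { print $1, "fail" }
')
echo "$report"
```

Every input line is tested against each rule in turn, so a line that matched both patterns would trigger both actions.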

Changelog v5.0

1. Support for the POSIX standard %a and %A printf formats has been added.

2. The test infrastructure has been greatly improved, simplifying the contents of test/ and making it possible to generate pc/Makefile.tst from test/

3. The regex routines have been replaced with those from GNULIB, allowing the maintainer to stop carrying forward decades of changes against the original ones from GLIBC.

4. Infrastructure upgrades: Bison 3.3, Automake 1.16.1, Gettext, makeinfo 6.5.

5. The undocumented configure option and code that enabled the use of non-English “letters” in identifiers are now gone.

6. The `--with-whiny-user-strftime' configuration option is now gone.

7. The code now makes some stronger assumptions about a C99 environment.

8. PROCINFO["platform"] yields a string indicating the platform for which gawk was compiled.

9. Writing to elements of SYMTAB that are not variable names now causes a fatal error. THIS CHANGES BEHAVIOR.

10. Comment handling in the pretty-printer has been reworked almost completely from scratch. As a result, comments in many corner cases that were previously lost are now included in the formatted output.

11. Namespaces have been implemented! See the manual. One consequence of this is that files included with -i, read with -f, and command line program segments must all be self-contained syntactic units. E.g., you can no longer do something like this:

gawk -e 'BEGIN {' -e 'print "hello" }'

12. Gawk now uses the locale settings for ignoring case in single-byte locales, instead of hardwiring in Latin-1.

13. A number of bugs, some of them quite significant, have been fixed. See the ChangeLog for details.
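
Items 8 and 11 above can be tried from a shell. This is a minimal sketch that assumes gawk 5.0 or later is installed and on the PATH as "gawk"; the variable names are hypothetical, and the platform string printed depends on how your gawk was built.

```shell
if command -v gawk >/dev/null 2>&1; then
    # Item 8: ask gawk which platform it was compiled for.
    # (Older gawk versions leave this element empty.)
    platform=$(gawk 'BEGIN { print PROCINFO["platform"] }')
    echo "platform: $platform"

    # Item 11: every -e segment must be a complete syntactic unit,
    # so each segment here holds a whole BEGIN rule of its own.
    greeting=$(gawk -e 'BEGIN { msg = "hello" }' -e 'BEGIN { print msg }')
    echo "$greeting"
else
    echo "gawk is not installed" >&2
fi
```

Splitting the program this way keeps multiple -e segments usable: the BEGIN rules run in the order given, and each one parses on its own.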