Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007 Free Software Foundation, Inc.

Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.

Changes from 3.1.5 to 3.1.6
---------------------------

1. `gawk 'program' /non/existant/file' no longer core dumps.

2. Too many people the world over have complained about gawk's use of the
   locale's decimal point for parsing input data instead of the traditional
   period. So, even though gawk was being nicely standards-compliant, in
   a Triumph For The Users, gawk now only uses the locale's decimal point
   if --posix is supplied or if POSIXLY_CORRECT is set. It is the sincere
   hope that this change will eliminate this FAQ from being asked.

3. `gawk -v BINMODE=1 ...' works again.

4. Internal file names like `/dev/user' now work again. (Note that these
   file names are obsolete and will go away eventually.)

5. Problems with wide strings in non "C" locales have been straightened
   out everywhere. (At least, we think so.)

6. Use of `ansi2knr' is no longer supported. Please use an ANSI C compiler.

7. Updated to Autoconf 2.61, Automake 1.10, and Gettext 0.16.1.

8. The getopt* and regex* files were synchronized with current GLIBC CVS.
   See the ChangeLog for the versions and minor edits made.

9. There are additional --lint-old warnings.

10. Gawk now uses getaddrinfo(3) to look up names and IP addresses. This
    allows the use of an IPv6 format address and paves the way for
    eventual addition of `/inet6/...' and `/inet4/...' hostnames.

11. We believe gawk to now be valgrind clean. At least when run against
    the test suite.

12. A number of issues dealing with the formatting and printing of very
    large numbers in integer formats have been dealt with and fixed.

13. Gawk now converts "+inf", "-inf", "+nan" and "-nan" into the
    corresponding magic IEEE floating point values.
    Only those strings (case independent) work. With --posix, gawk calls
    the system strtod directly. You asked for it, you got it, you deal
    with it.

14. Defining YYDEBUG enables the -D command line option.

15. Gawk should now work out of the box on Tandem NSK/OSS systems.

16. Lint messages rationalized: many more of the messages are now printed
    only once, instead of every time they are encountered.

17. The strftime() function now accepts an optional third argument, which
    if non-zero or non-null, indicates that the time should be formatted
    as UTC instead of as local time.

18. The precedence of concatenation and `| getline' (in something like
    "echo " "date" | getline stuff) has been reverted to the earlier
    behavior and now once again matches Unix awk.

19. New configure time flag --disable-directories-fatal which causes
    gawk to silently skip directories on the command line. This behavior
    is also enabled for --traditional, since it's what Unix awk does.

20. A new option, --use-lc-numeric, forces use of the locale's decimal
    point without the rest of the draconian restrictions imposed by
    --posix. This softens somewhat the stance taken in item #2.

21. Everything relevant has been updated to the GPL 3.

22. Array growth should be faster now, at no cost in space.

23. Lots more tests.

24. One new translation.

25. Various bugs fixed, see the ChangeLog for details.

Changes from 3.1.4 to 3.1.5
---------------------------

1. The random() suite has been updated to a current FreeBSD version, which
   works on systems with > 32-bit ints.

2. A new option, `--exec' has been added. It's like -f but ends option
   processing. It also disables `x=y' variable assignments, but not -v.
   It's needed mainly for CGI scripts, so that source code can't be
   passed in as part of the URL.

3. dfa.[ch] have been synced with GNU grep development. This also fixes
   multiple regex matching problems in multibyte locales.

4. Updated to Automake 1.9.5.

5. Updated to Bison 2.0.

6.
   The getopt* and regex* files were synchronized with current GLIBC CVS.
   See the ChangeLog for the versions and minor edits made.

7. `configure --disable-nls' now disables just gawk's own translations.
   Gawk continues to work with the locale's numeric formatting. This
   includes a bug fix in handling the printf ' flag (e.g., %'d).

8. Gawk is now multibyte aware. This means that index(), length(),
   substr() and match() all work in terms of characters, not bytes.

9. Gawk is now smarter about parsing numeric constants in corner cases.

10. Not closing open redirections no longer causes gawk to exit non-zero.

11. The VMS port has been updated.

12. Changes from Andrew Schorr at the xmlgawk project to provide for open
    hooks from extensions are now included. This will let the xmlgawk
    extension work in the standard gawk.

13. Updated to gettext 0.14.4. Gawk no longer includes its own copy of the
    gettext `intl' library, following current GNU practice to rely on
    there being an external version thereof.

14. With --lint, a regexp of the form `//' now generates a warning that it
    is not a C++ comment (awk.y).

15. The ^ and ^= operators with an integer exponent now use Exponentiation
    by Squaring. This simultaneously fixes a problem with ^= and a
    negative integer exponent.

16. length(array) now returns the number of elements in the array. This
    is a non-standard extension that will fail in POSIX mode.

17. Carriage return characters are now ignored in program source code.

18. Four new translations added.

19. Various minor bugs fixed. See the ChangeLog for the details.

Changes from 3.1.3 to 3.1.4
---------------------------

1. Gawk now supports the POSIX %F format, falling back to %f if the local
   system printf doesn't handle it.

2. Gawk now supports the ' flag in printf. E.g., %'d in a locale with
   thousands separators includes the thousands separator in the value,
   e.g. 12,345.
   This has one problem; the ' flag is next to impossible to use on the
   command line, without major quoting games. Oh well, TANSTAAFL.

3. The dfa code has been reinstated; the performance degradation was
   just too awful. Sigh. (For fun, use `export GAWK_NO_DFA=1' to see the
   difference.)

4. The special case `x = x y' is now recognized in the grammar, and gawk
   now uses `realloc' to append the new value to the end of the existing
   one. This can speed up the common case of appending onto a string.

5. The dfa code was upgraded with most of the fixes from grep 2.5.1, and
   the regex code was upgraded from GLIBC as of mid-January 2004. The
   regex code is faster than it was, but still not as fast as the dfa
   code, so the dfa code stays in. The getopt code was also synced to
   current GLIBC.

6. Support code upgraded to Automake 1.8.5, Autoconf 2.59, and gettext
   0.14.1.

7. When --posix is in effect, sub/gsub now follow the 2001 POSIX behavior.
   Yippee. This is even documented in the manual.

8. Gawk will now recover children that have died (input pipelines, two-way
   pipes), upon detecting EOF from them, thus avoiding filling up the
   process table. Open file descriptors are not recovered (unfortunately),
   since that could break awk semantics. See the ChangeLog and the source
   code for the details.

9. Handling of numbers like `0,1' in non-American locales ought to work
   correctly now.

10. IGNORECASE is now locale-aware for characters with values above 128.
    The dfa matcher is now used for IGNORECASE matches too.

11. Dynamic function loading is better. The documentation has been
    improved and some new APIs for use by dynamic functions have been
    added.

12. Gawk now has a fighting chance of working on older systems, a la
    SunOS 4.1.x.

13. Issues with multibyte support on HP-UX are now resolved. `configure'
    now disables such support there, since it's not up to what gawk needs.

14. There are now even more tests in the test suite.

15. Various bugs fixed; see ChangeLog for the details.
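Item 4 above concerns the append pattern `x = x y'. As a rough
illustration of the kind of program that can benefit (a sketch only; it
says nothing about gawk's internals), the loop below builds a string by
repeated appending. The variable names and the AWK fallback are
assumptions made for this example, not part of gawk.

```sh
# Use gawk if present, otherwise fall back to the system awk
# (assumption: at least one of the two is installed).
AWK=$(command -v gawk || command -v awk)

# Append one token per iteration; `s = s ...' is the pattern from item 4.
result=$("$AWK" 'BEGIN {
    s = ""
    for (i = 1; i <= 5; i++)
        s = s i ","        # same variable on both sides: an append
    print s
}')
echo "$result"
```

The program itself is portable awk; only the speed of the append step is
what the item describes as having changed.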
Changes from 3.1.2 to 3.1.3
---------------------------

1. Gawk now follows POSIX in handling of local numeric formats for input,
   output and number/string conversions.

2. Multibyte detection improved. See README_d/README.multibyte for more
   info about multibyte locales.

3. Handling of `close' made more POSIX-compliant for POSIXLY_CORRECT, see
   the documentation.

4. The record reading code was redone, again. This time it's much better.
   Really!

5. For RS = "\n" and RS = "", gawk now only sets RT when it has changed.
   This provides considerable performance improvement.

6. `match' now sets all the subscripts in the third argument array
   correctly, even if not all subexpressions matched.

7. Updated to Automake 1.7.5. configure.in renamed configure.ac.

8. C-style switch statements are available, but must be enabled at compile
   time via `configure --enable-switch'. For 3.2 they'll be enabled by
   default. Thanks to Michael Benzinger for the initial code.

9. %c now always prints no more than one character, whatever precision is
   provided.

10. strtonum() now works again.

11. Gawk is now much better about scalar/array typing of global
    uninitialized variables passed as parameters. Once the parameter is
    then used one way or the other, the global var's type is adjusted
    accordingly. Thanks to Stepan Kasal for the original (considerable)
    changes.

12. Dynamic function loading under Windows32 should now be possible. See
    README_d/README.pcdynamic. Thanks to Patrick T.J. McPhee for the
    changes.

13. Updated to gettext 0.12.1.

14. Gawk now follows historical practice and POSIX for the return value
    of `rand': It's now 0 <= N < 1.

Changes from 3.1.1 to 3.1.2
---------------------------

1. Loops of the form:
        for (iggy in foo)
                next
   no longer leak memory.

2. gawk -v FIELDWIDTHS="..." now sets PROCINFO["FS"] correctly.

3. All builtin operations and functions should now fully evaluate their
   arguments so that side effects take place correctly.

4.
   Fixed a logic bug in gsub/gensub for matches to null strings that
   occurred later in the string after a nonnull match.

5. getgroups code now works on Ultrix again.

6. Completely new version of the full GNU regex engine now in place.

7. Argument parsing and variable assignment has been cleaned up.

8. An I/O bug on HP-UX has been documented and worked around. See
   README_d/README.hpux.

9. awklib/grcat should now compile correctly.

10. Updated to automake 1.7.3, autoconf 2.57 and gettext 0.11.5; thanks to
    Paul Eggert for the initial automake and autoconf work.

11. As a result of #6, removed the use of the dfa code from GNU grep.

12. It is now possible to use ptys for |& two-way pipes instead of pipes.
    The basic plumbing for this was provided by Paolo Bonzini.
    To make this happen:

        command = "unix command etc"
        PROCINFO[command, "pty"] = 1
        print ... |& command
        command |& getline stuff

    In other words, set the element in PROCINFO *before* opening the
    two-way pipe, and then gawk will use ptys instead of pipes.

    On systems without ptys or where all the ptys are in use, gawk will
    fall back to using plain pipes.

13. Fixed a regex matching across buffer boundaries bug, with a heuristic.
    See io.c:rsre_get_a_record.

14. Profiling no longer dumps core if there are extension functions in
    place.

15. Grammar and scanner cleaned up, courtesy of Stepan Kasal, to hopefully
    once and for all fix the `/=' operator vs. `/=.../' regex ambiguity.
    Lots of other grammar simplifications applied, as well.

16. BINMODE should work now on more Windows ports.

17. Updated to bison 1.875. Includes fix to bisonfix.sed script.

18. The NODE structure is now 20% (8 bytes) smaller (on x86, anyway),
    which should help conserve memory.

19. Builds not in the source directory should work again.

20. Arrays now use 2 NODE's per element instead of three. Combined with
    #18, (on the x86) this reduces the overhead from 120 bytes per
    element to just 64 bytes: almost a 50% improvement.

21.
    Programs that make heavy use of changing IGNORECASE should now be
    much faster, particularly if using a regular expression for FS or RS.
    IGNORECASE now correctly affects RS regex record splitting, as well.

22. IGNORECASE no longer affects single-character field splitting
    (FS = "c"), or single-character record splitting (RS = "c"). This
    cleans up some weird behavior, and makes gawk better match the
    documentation, which says it only affects regex-based field splitting
    and record splitting. The documentation on this was improved, too.

23. The framework in test/ has been simplified, making it much easier to
    add new tests while keeping the size of Makefile.am reasonable.
    Thanks for this to Stepan Kasal.

24. --lint=invalid causes lint warnings only about stuff that's actually
    invalid. This needs additional work.

25. More translations.

26. The `get_a_record' routine has been revamped (currently by splitting
    it into three variants). This should improve long-term
    maintainability.

27. `match' now adds more entries to 3rd array arg:

        match("the big dog", /([a-z]+) ([a-z]+) ([a-z]+)/, data)

    fills in variables: data[1, "start"], data[1, "length"], and so on.

28. New `asorti' function with same interface as `asort', but sorts
    indices instead of values.

29. Documentation updated to FDL 1.2.

30. New `configure' option --disable-lint at compile time disables lint
    checking. With GCC dead-code-elimination, cuts almost 200K off the
    executable size on GNU/Linux x86. Presumably speeds up runtime.
    Using this will cause some of the tests in the test suite to fail.
    This option may be removed at a later date.

31. Various minor cleanups, see the ChangeLog for details.

Changes from 3.1.0 to 3.1.1
---------------------------

1. Six new translations.

2. Having more than 4 different values for OFMT and/or CONVFMT now works.

3. The handling of dynamic regexes is now more sane, esp. w.r.t. the
   profiling code. The profiling code has been fixed in several places.

4.
   The return value of index("", "") is now 1.

5. Gawk should no longer close fd 0 in child processes.

6. Fixed test for strtod semantics and regenerated configure.

7. Gawk can now be built with byacc; an accidental bison dependency was
   removed.

8. `yyerror' will no longer dump core on long source lines.

9. Gawk now correctly queries getgroups(2) to figure out how many groups
   the process has.

10. New configure option to force use of included strftime, e.g. on
    Solaris systems. See `./configure --help' for the details. Replaced
    the included strftime.c with the one from textutils.

11. OS/2 port has been updated.

12. Multi-byte character support has been added, courtesy of IBM Japan.

13. The `for (iggy in foo) delete foo[iggy]' -> `delete foo' optimisation
    now works.

14. Upgraded to gettext 0.11.2 and automake 1.5.

15. Full gettext compatibility (new dcngettext function).

16. The O'Reilly copyedits and indexing changes for the documentation
    have been folded into the texinfo version of the manuals.

17. A humongously long value for the AWKPATH environment variable will no
    longer dump core.

18. Configuration / Installation issues have been straightened out in
    Makefile.am.

Changes from 3.0.6 to 3.1.0
---------------------------

1. A new PROCINFO array provides info about the process. The non-I/O
   /dev/xxx files are now obsolete, and their use always generates a
   warning.

2. A new `mktime' builtin function was added for creating time stamps.
   The `mktime' function written in awk was removed from the user's guide.

3. New `--gen-po' option creates GNU gettext .po files for strings marked
   with a leading underscore.

4. Gawk now completely interprets special file names internally, ignoring
   the existence of real /dev/stdin, /dev/stdout files, etc.

5. The mmap code was removed. It was a worthwhile experiment that just
   didn't work out.

6. The BINMODE variable is new; on non-UNIX systems it affects how gawk
   opens files for text vs. binary.

7. The atari port is now unsupported.

8.
   Gawk no longer supports `next file' as two words.

9. On systems that support it, gawk now sets the `close on exec' flag on
   all files and pipes it opens. This makes sure that child processes run
   via `system' or pipes have plenty of file descriptors available.

10. New ports: Tandem and BeOS. The Tandem port is unsupported.

11. If `--posix' is in effect, newlines are not allowed after ?:.

12. Weird OFMT/CONVFMT formats no longer cause fatal errors.

13. Diagnostics about array parameters now include the parameter's name,
    not just its number.

14. configure should now automatically add -D_SYSV3 for ISC Unix.
    (This seems to have made it into the gawk 3.0.x line long ago.)

15. It is now possible to open a two-way pipe via the `|&' operator.
    See the discussion in the manual about putting `sort' into such a
    pipeline, though. (NOTE! This is borrowed from ksh: it is not the
    same as the same operator in csh!)

16. The `close' function now takes an optional second string argument
    that allows closing one or the other end of the two-way pipe to a
    co-process. This is needed to use `sort' in a co-process, see the doc.

17. If TCP/IP is available, special file names beginning with `/inet'
    can be used with `|&' for IPC. Thanks to Juergen Kahrs for the
    initial code.

18. With `--enable-portals' on the configure command line, gawk will also
    treat file names that start with `/p/' as a 4.4 BSD type portal file,
    i.e., a two-way pipe for `|&'.

19. Unrecognized escapes, such as "\q" now always generate a warning.

20. The LINT variable is new; it provides dynamic control over the --lint
    option.

21. Lint warnings can be made fatal by using --lint=fatal or
    `LINT = "fatal"'. Use this if you're really serious about portable
    code.

22. Due to an enhanced sed script, there is no longer any need to worry
    about finding or using alloca. alloca.c is thus now gone.

23. A number of lint warnings have been added. Most notably, gawk will
    detect if a variable is used before assigned to.
    Warnings for when a string that isn't a number gets converted to a
    number are in the code but disabled; they seem to be too picky in
    practice. Also, gawk will now warn about function parameter names
    that shadow global variable names.

24. It is now possible to dynamically add builtin functions on systems
    that support dlopen. This facility is not (yet) as portable or well
    integrated as it might be.
    *** WARNING *** THIS FEATURE WILL EVOLVE!

25. There are *many* new tests in the test suite.

26. Profiling has been added! A separate version of gawk, named pgawk, is
    built and generates a run-time execution profile. The --profile
    option can be used to change the default output file. In regular
    gawk, this option pretty-prints the parse tree.

27. Gawk has been internationalized, using GNU gettext. Translations for
    future distributions are most welcome. Simultaneously, gawk was
    switched over to using automake. You need Automake 1.4a (from the
    CVS archive) if you want to muck with the Makefile.am files.

28. New `asort' function for sorting arrays. See the doc for details.

29. The match function takes an optional array third argument to hold
    the text matched by parenthesized sub-expressions.

30. The bit op functions and octal and hex source code constants are on
    by default, no longer a configure-time option. Recognition of
    non-decimal data is now enabled at runtime with --non-decimal-data
    command line option.

31. Internationalization features available at the awk level: new
    TEXTDOMAIN variable and `bindtextdomain' and `dcgettext' functions.
    printf formats may contain the "%2$3.5d" kind of notation for use in
    translations. See the texinfo manual for details.

32. The return value from `close' has been rationalized. Most notably,
    closing something that wasn't open returns -1 but remains non-fatal.

33. The array efficiency change from 3.0.5 was reverted; the semantics
    were not right. Additionally, index values of previously stored
    elements can no longer change dynamically.

34.
    The new option --dump-variables dumps a list of all global variables
    and their final types and values to a file you give, or to
    `awkvars.out'.

35. Gawk now uses a recent version of random.c courtesy of the FreeBSD
    project.

36. The gawk source code now uses ANSI C function definitions (new style),
    with ansi2knr to translate code for old compilers.

37. `for (iggy in foo)' loops should be more robust now in the face of
    adding/deleting elements in the middle; they loop over just the
    elements that are present in the array when the loop starts.

Changes from 3.0.5 to 3.0.6
---------------------------

This is a bug fix release only, pending further development on 3.1.0.

Bugs fixed and changes made:

1. Subscripting an array with a variable that is just a number no longer
   magically converts the variable into a string.

2. Similarly, running a `for (iggy in foo)' loop where `foo' is a
   function parameter now works correctly.

3. Similarly, `i = ""; v[i] = a; if (i in v) ...' now works again.

4. Gawk now special cases `for (iggy in foo) delete foo[iggy]' and
   treats it as the moral equivalent of `delete foo'. This should be
   a major efficiency win when portably deleting large arrays.

5. VMS port brought up to date.

Changes from 3.0.4 to 3.0.5
---------------------------

This is a bug fix release only, pending further development on 3.1.0.

Bugs Fixed:

1. `function foo(foo)' is now a fatal error.

2. Array indexing is now much more efficient: where possible, only one
   copy of an index string is kept, even if used in multiple arrays.

3. Support was added for MacOS X and an `install-strip' target.

4. [s]printf formatting for `0' flag and floating point formats now
   works correctly.

5. HP-UX large file support with GCC 2.95.1 now works.

6. Arguments that contain `=' but that aren't syntactically valid are
   now treated as filenames, instead of as fatal errors.

7. `-v NF=foo' now works.

8. Non-ascii alphanumeric characters are now treated as such in the right
   locales by regex.c.
   Similarly, a Latin-1 y-umlaut (decimal value 255) in the program text
   no longer acts like EOF.

9. Array indexes are always compared as strings; fixes an obscure bug
   when user input gets used for the `x in array' test.

10. The usage message now points users to the documentation for how
    to report bugs.

11. `/=' now works after an array.

12. `b += b += 1' now works correctly.

13. IGNORECASE changing with calls to `match' now works better. (Fix for
    semi-obscure bug.)

14. Multicharacter values for RS now generate a lint warning.

15. The gawk open file caching is now much more efficient.

16. Global arrays passed to functions are now managed better. In
    particular, test/arynocls.awk won't crash referencing freed memory.

17. In obscure cases, `getline var' can no longer clobber $0.

Changes from 3.0.3 to 3.0.4
---------------------------

This is a bug fix release only, pending further development on 3.1.0.

Bugs Fixed:

1. A memory leak when turning a function parameter into an array was
   fixed.

2. The non-decimal data option now works correctly.

3. Using an empty pair of brackets as an array subscript no longer causes
   a core dump during parsing. In general, syntax errors should not
   cause core dumps any more.

4. Standard input is no longer closed if it provides program source,
   avoiding strange I/O problems.

5. Memory corruption during printing with `print' has been fixed.

6. The gsub function now correctly counts the number of matches.

7. A typo in doc/Makefile.in has been fixed, making installation work.

8. Calling `next' or `nextfile' from a BEGIN or END rule is now fatal.

9. Subtle problems in rebuilding $0 when fields were changed have been
   fixed.

10. `FS = FS' now correctly turns off the use of FIELDWIDTHS.

11. Gawk now parses fields correctly when FS is a single character.

12. It is now possible for RS to be the NUL character ("\0").

13. Weird problems with number conversions on MIPS and other systems
    have been fixed.

14.
    When parsing using FIELDWIDTHS is in effect, `split' with no third
    argument will still use the value of FS.

15. Large File Support for Solaris, HP-UX, AIX, and IRIX is now enabled at
    compile time, thanks to Paul Eggert.

16. Attempting to use the name of a function as a variable or array
    from within the function is now caught as a fatal error, instead
    of as a core dump.

17. A bug in parsing hex escapes was fixed.

18. A weird bug with concatenation where one expression has side effects
    that change another was fixed.

19. printf/sprintf now behave much better for uses of the '0' and '#'
    flags and with precisions and field widths.

20. Further strangenesses with concatenation and multiple accesses of
    some of the special variables were fixed.

21. The Atari port is marked as no longer supported.

22. Build problems on HP-UX have been fixed.

23. Minor fixes and additional explanations added to the documentation.

24. For RS = "", even a single leading newline is now correctly stripped.

25. Obscure parsing problems for regex constants like /=.../ fixed, so
    that a regex constant is recognized, and not the /= operator.

26. Fixed a bug when closing a redirection that matched the current
    or last FILENAME.

27. Build problems on AIX fixed.

Changes from 3.0.2 to 3.0.3
---------------------------

The horrendous per-record memory leak introduced in 3.0.1 is gone, finally.

The `amiga' directory is now gone; Amiga support is now entirely handled
by the POSIX support.

Windows32 support has been added in the `pc' directory. See
`README_d/README.pc' for more info.

The mmap changes are disabled in io.c, and will be removed entirely
in the next big release. They were an interesting experiment that just
really didn't work in practice.

A minor memory leak that occurred when using `next' from within a
function has also been fixed.

Problems with I/O from sub-processes via a pipe are now gone.

Using "/dev/pid" and the other special /dev files no longer causes a core
dump.
The files regex.h, regex.c, getopt.h, getopt.c, and getopt1.c have been
merged with the versions in GNU libc. Thanks to Ulrich Drepper for his
help.

Some new undocumented features have been added. Use the source, Luke!
It is not clear yet whether these will ever be fully supported.

Array performance should be much better for very very large arrays.
"Virtual memory required, real memory helpful."

builtin.c:do_substr rationalized, again.

The --re-interval option now works as advertised.

The license text on some of the missing/* files is now generic.

Lots more new test cases.

Lots of other small bugs fixed, see the ChangeLog files for details.

Changes from 3.0.1 to 3.0.2
---------------------------

Gawk now uses autoconf 2.12.

strftime now behaves correctly if passed an empty format string or if
the string formats to an empty result string.

Several minor compilation and installation problems have been fixed.

Minor page break issues in the user's guide have been fixed.

Lexical errors no longer repeat ad infinitum.

Changes from 3.0.0 to 3.0.1
---------------------------

Troff source for a handy-dandy five color reference card is now provided.
Thanks to SSC for their macros.

Gawk now behaves like Unix awk and mawk, in that newline acts as white
space for separating fields and for `split', by default. In posix mode,
only space and tab separate fields. The documentation has been updated to
reflect this.

Tons and tons of small bugs fixed and new tests added, see the ChangeLogs.

Lots fewer compile time warnings from gcc -Wall. Remaining ones aren't
worth fixing.

Gawk now pays some attention to the locale settings.

Fixes to gsub to catch several corner cases.

The `print' statement now evaluates all expressions first, and then
prints them. This leads to less surprising behaviour if any expression
has output side effects.

Miscellaneous improvements in regex.h and regex.c.

Gawk will now install itself as gawk-M.N.P in $(bindir), and link
`gawk' to it.
This makes it easy to have multiple versions of gawk simultaneously.
It will also now install itself as `awk' in $(bindir) if there is no
`awk' there. This is in addition to installing itself as `gawk'. This
change benefits the Hurd, and possibly other systems. One day, gawk
will drop the `g', but not yet.

`--posix' turns on interval expressions. Gawk now matches its
documentation.

`close(FILENAME)' now does something meaningful.

Field management code in field.c majorly overhauled, several times.

The gensub code has been fixed, several bugs are now gone.

Gawk will use mmap for data file input if it is available.

The printf/sprintf code has been improved.

Minor issues in Makefile setup worked on and improved.

builtin.c:do_substr rationalized.

Regex matching fixed so that /+[0-9]/ now matches the leading +.

For building on vms, the default compiler is now DEC C rather than VAX C.

Changes from 2.15.6 to 3.0.0
----------------------------

Fixed spelling of `Programming' in the copyright notice in all the files.

New --re-interval option to turn on interval expressions. They're off
by default, except for --posix, to avoid breaking old programs.

Passing regexp constants as parameters to user defined functions now
generates a lint warning.

Several obscure regexp bugs fixed; alas, a small number remain.

The manual has been thoroughly revised. It's now almost 50% bigger than
it used to be.

The `+' modifier in printf is now reset correctly for each item.

The do_unix variable is now named do_traditional.

Handling of \ in sub and gsub rationalized (somewhat, see the manual for
the gory [and I do mean gory] details).

IGNORECASE now uses ISO 8859-1 Latin-1 instead of straight ASCII. See the
source for how to revert to pure ASCII.

--lint will now warn if an assignment occurs in a conditional context.
This may become obnoxious enough to need turning off in the future, but
"it seemed like a good idea at the time."

%hf and %Lf are now diagnosed as invalid in printf, just like %lf.
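The --lint note above about assignment in a conditional context is
easier to see with a small case. The sketch below (input data, variable
names and the AWK fallback are all made up for illustration) runs the
same rule twice, once with `=' and once with `=='. Running the first
program under `gawk --lint' is where the warning described above would
be expected; the exact message text is not shown here.

```sh
# Use gawk if present, otherwise fall back to the system awk
# (assumption: at least one of the two is installed).
AWK=$(command -v gawk || command -v awk)

# Buggy: `=' assigns 2 to $1, and the non-zero result is always true.
buggy=$(printf '1\n2\n3\n' | "$AWK" '{ if ($1 = 2) n++ } END { print n + 0 }')

# Intended: `==' compares, so only the second record matches.
fixed=$(printf '1\n2\n3\n' | "$AWK" '{ if ($1 == 2) n++ } END { print n + 0 }')

echo "buggy=$buggy fixed=$fixed"
```

With the three-line input used here, the first count is 3 and the
second is 1.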
Gawk no longer incorrectly closes stdin in child processes used in
input pipelines.

For integer formats, gawk now correctly treats the precision as the
number of digits to print, not the number of characters.

gawk is now much better at catching the use of scalar values when
arrays are needed, both in function calls and the `x in y' constructs.

New gensub function added. See the manual.

If do_traditional is true, octal and hex escapes in regexp constants are
treated literally. This matches historical behavior.

yylex/nextc fixed so that even null characters can be included
in the source code.

do_format now handles cases where a format specifier doesn't end in
a control letter. --lint reports an error.

strftime() now uses a default time format equivalent to that of the
Unix date command, thus it can be called with no arguments.

Gawk now catches functions that are used but not defined at parse time
instead of at run time. (This is a lint error, making it fatal could break
old code.)

Arrays that max out are now handled correctly.

Integer formats outside the range of an unsigned long are now detected
correctly using the SunOS 4.x cc compiler.

--traditional option added as new preferred name for --compat, in keeping
with GCC.

--lint-old option added, so that warnings about things not in old awk
are only given if explicitly asked for.

`next file' has changed to one word, `nextfile'. `next file' is still
accepted but generates a lint warning. `next file' will go away
eventually.

Gawk with --lint will now notice empty source files and empty data files.

Amiga support using the Unix emulation added. Thanks to fnf@ninemoons.com.

test/Makefile is now "parallel-make safe".

Gawk now uses POSIX regexps + GNU regex ops by default. --posix goes to
pure posix regexps, and --compat goes to traditional Unix regexps.
However, interval expressions, even though specified by POSIX, are turned
off by default, to avoid breaking old code.
IGNORECASE now applies to string comparison as well as regexp operations.

The AT&T Bell Labs Research awk fflush builtin function is now supported.
fflush is extended to flush stdout if no arg and everything if given
the null string as an argument.

If RS is more than one character, it is treated as a regular expression
and records are delimited accordingly. The variable RT is set to the
record terminator string. This is disabled in compatibility mode.

If FS is set to the null string (or the third arg. of split() is the null
string), splitting is done at every single character. This is disabled in
compatibility mode.

Gawk now uses the Autoconf generated configure script, doing away with all
the config/* files and the machinery that went with them. The Makefile.in
has also changed accordingly, complete with all the standard GNU Makefile
targets. (Non-unix systems may still have their own config.h and Makefile;
see the appropriate README_d/README.* and/or subdirectory.)

The source code has been cleaned up somewhat and the formatting improved.

Changes from 2.15.5 to 2.15.6
-----------------------------

Copyrights updated on all changed files.

test directory enhanced with four new tests.

Gawk now generates a warning for \x without following hexadecimal digits.
In this case, it returns 'x', not \0.

Several fixes in main.c related to variable initialization:
        CONVFMT has a default value
        resetup is called before initializing variables
        the varinit table fixed up a bit (see the comments)

gawk.1 updated with new BUG REPORTS section.

A plain `print' inside a BEGIN or END now generates a lint warning (awk.y).

Small fix in iop.c:get_a_record to avoid reading uninitialized memory.

awk.y:yylex now does a better job of handling things if the source file
does not end in a newline. Probably there is more work to be done.

Memory leaks fixed in awk.y, particularly in cases of duplicate function
parameters. Also, calling a function doesn't leak memory during parsing.
Empty function bodies are now allowed (awk.y). Gawk now detects duplicate parameter names in functions (awk.y). New function `error' in msg.c added for use from awk.y. eval.c:r_get_lhs now checks if its argument is a parameter on the stack, and pulls down the real variable. This catches more 'using an array as a scalar' kinds of errors. main.c recovers C alloca space after parsing, this is important for bison-based parsers. re.c recovers C alloca space after doing an research. [Changes from Pat Rankin] builtin.c now declares the random() related functions based on RANDOM_MISSING from config.h. [Suggested by Pat Rankin] awk.h now handles alloca correctly for HP-UX. [Kaveh Ghazi] regex.h and config/cray60 updated for Unicos 8.0. [Hal Peterson] Fixed re.c and dfa.c so that gawk no longer leaks memory when using lots of dynamic regexps. Removed dependency on signed chars from `idx' variable in awk.h. Gawk now passes its test suite if compiled with `gcc -fno-signed-char'. Fixed warning on close in io.c to go under lint control. Too many people have complained about the spurious message, particularly when closing a child pipeline early. Gawk now correctly handles RS = "" when input is from a terminal (iop.c:get_a_record). Config file added for GNU. gawk 'BEGIN { exit 1 } ; END { exit }' now exits 1, as it should (eval.c:interpret). sub and gsub now follow posix, \ escapes both & and \. Each \ must be doubled initially in the program to get it into the string. Thanks to Mike Brennan for pointing this out (builtin.c:sub_common). If FS is "", gawk behaves like mawk and nawk, making the whole record be $1. Yet Another Dark Corner. Sigh (field.c:def_parse_field). Gawk now correctly recomputes string values for numbers if CONVFMT has changed (awk.h:force_string, node.c:r_force_string). A regexp of the form `/* this looks like a comment but is not */' will now generate a warning from --lint (awk.y). 
Gawk will no longer core dump if given an empty input file (awk.y:get_src_buf, iop.c:optimal_bufsize).

A printf format of the form %lf is handled correctly. The `l' generates a lint warning (builtin.c:format_tree) [Thanks to Mark Moraes].

Lynxos config file added.

`continue' outside a loop treated as `next' only in compatibility mode, instead of by default; recent att nawk chokes on this now. `break' outside a loop now treated as `next' in compatibility mode (eval.c).

Bug fix in string concatenation, an arbitrary number of expressions are allowed (eval.c).

$1 += $2 now works correctly (eval.c).

Changing IGNORECASE no longer resets field-splitting to FS if it was using FIELDWIDTHS (eval.c, field.c).

Major enhancement: $0 and NF for last record read are now preserved into the END rule (io.c).

Regexp fixes:
    /./ now matches a newline (regex.h)
    ^ and $ match beginning and end of string only, not any embedded newlines (re.c)
    regex.c should compile and work ok on 64-bit mips/sgi machines

Changes from 2.15.4 to 2.15.5
-----------------------------

FUTURES file updated and re-arranged some with more rational schedule.

Many prototypes handled better for ANSI C in protos.h.

getopt.c updated somewhat.

test/Makefile now removes junk directory, `bardargtest' renamed `badargs.'

Bug fix in iop.c for RS = "". Eat trailing newlines off of record separator.

Bug fix in Makefile.bsd44, use leading tab in actions.

Fix in field.c:set_FS for FS == "\\" and IGNORECASE != 0.

Config files updated or added: cray60, DEC OSF/1 2.0, Utek, sgi405, next21, next30, atari/config.h, sco.

Fix in io.c for ENFILE as well as EMFILE, update decl of groupset to include OSF/1.

Rationalized printing as integers if numbers are outside the range of a long. Changes to node.c:force_string and builtin.c.

Made internal NF, NR, and FNR variables longs instead of ints.

Add LIMITS_H_MISSING stuff to config.in and awk.h, and default defs for INT_MAX and LONG_MAX, if no limits.h file.
Add a standard decl of the time() function for __STDC__. From ghazi@noc.rutgers.edu.

Fix tree_eval in awk.h and r_tree_eval in eval.c to deal better with function parameters, particularly ones that are arrays.

Fix eval.c to print out array names of arrays used in scalar contexts.

Fix eval.c in interpret to zero out source and sourceline initially. This does a better job of providing source file and line number information.

Fix to re_parse_field in field.c to not use isspace when RS = "", but rather to explicitly look for blank and tab.

Fix to sc_parse_field in field.c to catch the case of the FS character at the end of a record.

Lots of miscellaneous bug fixes for memory leaks, courtesy Mark Moraes, also fixes for arrays.

io.c fixed to warn about lack of explicit closes if --lint.

Updated missing/strftime.c to match posted strftime 6.2.

Bug fix in builtin.c, in case of non-match in sub_common.

Updated constant used for division in builtin.c:do_rand for DEC Alpha and CRAY Y-MP.

POSIXLY_CORRECT in the environment turns on --posix (fixed in main.c).

Updated srandom prototype and calls in builtin.c.

Fix awk.y to enforce posix semantics of unary +: result is numeric.

Fix array.c to not rearrange the hash chain upon finding an index in the array. This messed things up in cases like:
    for (index1 in array) {
        blah
        if (index2 in array)    # blew away the for stuff
    }

Fixed spelling errors in the man page.

Fixes in awk.y so that gawk '' /path/to/file will work without core dumping or finding parse errors.

Fix main.c so that --lint will fuss about an empty program.

Yet another fix for argument parsing in the case of unrecognized options.

Bug fix in dfa.c to not attempt to free null pointers.

Bug fix in builtin.c to only use DEFAULT_G_PRECISION for %g or %G.

Bug fix in field.c to achieve call by value semantics for split.

Changes from 2.15.3 to 2.15.4
-----------------------------

Lots of lint fixes, and do_sprintf made mostly ANSI C compatible.

Man page updated and edited.
Copyrights updated. Arrays now grow dynamically, initially scaling up by an order of magnitude and then doubling, up to ~ 64K. This should keep gawk's performance graceful under heavy load. New `delete array' feature added. Only documented in the man page. Switched to dfa and regex suites from grep-2.0. These offer the ability to move to POSIX regexps in the next release. Disabled GNU regex ops. Research awk -m option now recognized. It does nothing in gawk, since gawk has no static limits. Only documented in the man page. New bionic (faster, better, stronger than before) hashing function. Bug fix in argument handling. `gawk -X' now notices there was no program. Additional bug fixes to make --compat and --lint work again. Many changes for systems where sizeof(int) != sizeof(void *). Add explicit alloca(0) in io.c to recover space from C alloca. Fixed file descriptor leak in io.c. The --version option now follows the GNU coding standards and exits. Fixed several prototypes in protos.h. Several tests updated. On Solaris, warn that the out? tests will fail. Configuration files for SunOS with cc and Solaris 2.x added. Improved error messages in awk.y on gawk extensions if do_unix or do_compat. INSTALL file added. Fixed Atari Makefile and several VMS specific changes. Better conversion of numbers to strings on systems with broken sprintfs. Changes from 2.15.2 to 2.15.3 ----------------------------- Increased HASHSIZE to a decent number, 127 was way too small. FILENAME is now the null string in a BEGIN rule. Argument processing fixed for invalid options and missing arguments. This version will build on VMS. This included a fix to close all files and pipes opened with redirections before closing stdout and stderr. More getpgrp() defines. Changes for BSD44: in io.c and Makefile.bsd44. All directories in the distribution are now writable. Separated LDFLAGS and CFLAGS in Makefile. CFLAGS can now be overridden by user. 
Make dist now builds compressed archives ending in .gz and runs doschk. Amiga port. New getopt.c fixes Alpha OSF/1 problem. Make clean now removes possible test output. Improved algorithm for multiple adjacent string concatenations leads to performance improvements. Fix nasty bug whereby command-line assignments, both with -v and at run time, could create variables with syntactically illegal names. Fix obscure bug in printf with %0 flag and filling. Add a lint check for substr if provided length exceeds remaining characters in string. Update atari support. PC support enhanced to include support for both DOS and OS/2. (Lots more #ifdefs. Sigh.) Config files for Hitachi Unix and OSF/1, courtesy of Yoko Morishita (morisita@sra.co.jp) Changes from 2.15.1 to 2.15.2 ----------------------------- Additions to the FUTURES file. Document undefined order of output when using both standard output and /dev/stdout or any of the /dev output files that gawk emulates in the absence of OS support. Clean up the distribution generation in Makefile.in: the info files are now included, the distributed files are marked read-only and patched distributions are now unpacked in a directory named with the patch level. Changes from 2.15 to 2.15.1 --------------------------- Close stdout and stderr before all redirections on program exit. This allows detection of write errors and also fixes the messages test on Solaris 2.x. Removed YYMAXDEPTH define in awk.y which was limiting the parser stack depth. Changes to config/bsd44, Makefile.bsd44 and configure to bring it into line with the BSD4.4 release. Changed Makefile to use prefix, exec_prefix, bindir etc. make install now installs info files. make install now sets permissions on installed files. Make targets added: uninstall, distclean, mostlyclean and realclean. Added config.h to cleaner and clobber make targets. Changes to config/{hpux8x,sysv3,sysv4,ultrix41} to deal with alloca(). Change to getopt.h for portability. 
Added more special cases to the getpgrp() call.

Added README.ibmrt-aos and config/ibmrt-aos.

Changes from 2.14 to 2.15
---------------------------

Command-line source can now be mixed with library functions.

ARGIND variable tracks index in ARGV of FILENAME.

GNU style long options in addition to short options.

Plan 9 style special files interpreted by gawk:
    /dev/pid
    /dev/ppid
    /dev/pgrpid
    /dev/user
        $1 = getuid
        $2 = geteuid
        $3 = getgid
        $4 = getegid
        $5 ... $NF = getgroups if supported

ERRNO variable contains error string if getline or close fails.

Very old options -a and -e have gone away.

Inftest has been removed from the default target in test/Makefile -- the results were too machine specific and resulted in too many false alarms.

A README.amiga has been added.

The "too many arguments supplied for format string" warning message is only in effect under the lint option.

Code improvements in dfa.c.

Fixed all reported bugs:
    Writes are checked for failure (such as full filesystem).
    Stopped (at least some) runaway error messages.
    gsub(/^/, "x") does the right thing for $0 of 0, 1, or more length.
    close() on a command being piped to a getline now works properly.
    The input record will no longer be freed upon an explicit close() of the input file.
    A NUL character in FS now works.
    In a substitute, \\& now means a literal backslash followed by what was matched.
    Integer overflow of substring length in substr() is caught.
    An input record without a newline termination is handled properly.
    In io.c, check is against only EMFILE so that system file table is not filled.
    Renamed all files with names longer than 14 characters.
    Escaped characters in regular expressions were being lost when IGNORECASE was used.
    Long source lines were not being handled properly.
    Sourcefiles that ended in a tab but no newline were bombing.
    Patterns that could match zero characters in split() were not working properly.
    The parsedebug option was not working.
The grammar was being a bit too lenient, allowing some very dubious programs to pass. Compilation with DEBUG defined now works. A variable read in with getline was not being treated as a potential number. Array subscripts were not always of string type. Changes from 2.13.2 to 2.14 --------------------------- Updated manual! Added "next file" to skip efficiently to the next input file. Fixed potential of overflowing buffer in do_sprintf(). Plugged small memory leak in sub_common(). EOF on a redirect is now "sticky" -- it can only be cleared by close()ing the pipe or file. Now works if used via a #! /bin/gawk line at the top of an executable file when that line ends with whitespace. Added some checks to the grammar to catch redefinition of builtin functions. This could eventually be the basis for an extension to allow redefining functions, but in the mean time it's a good error catching facility. Negative integer exponents now work. Modified do_system() to make sure it had a non-null string to be passed to system(3). Thus, system("") will flush any pending output but not go through the overhead of forking an un-needed shell. A fix to floating point comparisons so that NaNs compare right on IEEE systems. Added code to make sure we're not opening directories for reading and such. Added code to do better diagnoses of weird or null file names. Allow continue outside of a loop, unless in strict posix mode. Lint option will issue warning. New missing/strftime.c. There has been one change that affects gawk. Posix now defines a %V conversion so the vms conversion has been changed to %v. If this version is used with gawk -Wlint and they use %V in a call to strftime, they'll get a warning. Error messages now conform to GNU standard (I hope). Changed comparisons to conform to the description found in the file POSIX. This is inconsistent with the current POSIX draft, but that is broken. Hopefully the final POSIX standard will conform to this version. 
(Alas, this will have to wait for 1003.2b, which will be a revision to the 1003.2 standard. That standard has been frozen with the broken comparison rules.) The length of a string was a short and now is a size_t. Updated VMS help. Added quite a few new tests to the test suite and deleted many due to lack of written releases. Test output is only removed if it is identical to the "good" output. Fixed a couple of bugs for reference to $0 when $0 is "" -- particularly in a BEGIN block. Fixed premature freeing in construct "$0 = $0". Removed the call to wait_any() in gawk_popen(), since on at least some systems, if gawk's input was from a pipe, the predecessor process in the pipe was a child of gawk and this caused a deadlock. Regexp can (once again) match a newline, if given explicitly. nextopen() makes sure file name is null terminated. Fixed VMS pipe simulation. Improved VMS I/O performance. Catch . used in variable names. Fixed bug in getline without redirect from a file -- it was quitting after the first EOF, rather than trying the next file. Fixed bug in treatment of backslash at the end of a string -- it was bombing rather than doing something sensible. It is not clear what this should mean, but for now I issue a warning and take it as a literal backslash. Moved setting of regexp syntax to before the option parsing in main(), to handle things like -v FS='[.,;]' Fixed bug when NF is set by user -- fields_arr must be expanded if necessary and "new" fields must be initialized. Fixed several bugs in [g]sub() for no match found or the match is 0-length. Fixed bug where in gsub() a pattern anchored at the beginning would still substitute throughout the string. make test does not assume that . is in PATH. Fixed bug when a field beyond the end of the record was requested after $0 was altered (directly or indirectly). Fixed bug for assignment to field beyond end of record -- the assigned value was not found on subsequent reference to that field. 
Fixed bug for FS a regexp and it matches at the end of a record. Fixed memory leak for an array local to a function. Fixed hanging of pipe redirection to getline Fixed coredump on access to $0 inside BEGIN block. Fixed treatment of RS = "". It now parses the fields correctly and strips leading whitespace from a record if FS is a space. Fixed faking of /dev/stdin. Fixed problem with x += x Use of scalar as array and vice versa is now detected. IGNORECASE now obeyed for FS (even if FS is a single alphabetic character). Switch to GPL version 2. Renamed awk.tab.c to awktab.c for MSDOS and VMS tar programs. Renamed this file (CHANGES) to NEWS. Use fmod() instead of modf() and provide FMOD_MISSING #define to undo this change. Correct the volatile declarations in eval.c. Avoid errant closing of the file descriptors for stdin, stdout and stderr. Be more flexible about where semi-colons can occur in programs. Check for write errors on all output, not just on close(). Eliminate the need for missing/{strtol.c,vprintf.c}. Use GNU getopt and eliminate missing/getopt.c. More "lint" checking. Changes from 2.13.1 to 2.13.2 ----------------------------- Toward conformity with GNU standards, configure is a link to mkconf, the latter to disappear in the next major release. Update to config/bsd43. Added config/apollo, config/msc60, config/cray2-50, config/interactive2.2 sgi33.cc added for compilation using cc rather than gcc. Ultrix41 now propagates to config.h properly -- as part of a general mechanism in configure for kludges -- #define anything from a config file just gets tacked onto the end of config.h -- to be used sparingly. Got rid of an unnecessary and troublesome declaration of vprintf(). Small improvement in locality of error messages. Try to diagnose use of array as scalar and vice versa -- to be improved in the future. Fix for last bug fix for Cray division code--sigh. More changes to test suite to explicitly use sh. Also get rid of a few generated files. 
Fixed off-by-one bug in string concatenation code. Fix for use of array that is passed in from a previous function parameter. Addition to test suite for above. A number of changes associated with changing NF and access to fields beyond the end of the current record. Change to missing/memcmp.c to avoid seg. fault on zero length input. Updates to test suite (including some inadvertently left out of the last patch) to invoke sh explicitly (rather than rely on #!/bin/sh) and remove some junk files. test/chem/good updated to correspond to bug fixes. Changes from 2.13.0 to 2.13.1 ----------------------------- More configs and PORTS. Fixed bug wherein a simple division produced an erroneous FPE, caused by the Cray division workaround -- that code is now #ifdef'd only for Cray *and* fixed. Fixed bug in modulus implementation -- it was very close to the above code, so I noticed it. Fixed portability problem with limits.h in missing.c Fixed portability problem with tzname and daylight -- define TZNAME_MISSING if strftime() is missing and tzname is also. Better support for Latin-1 character set. Fixed portability problem in test Makefile. Updated PROBLEMS file. =============================== gawk-2.13 released ========================= Changes from 2.12.42 to 2.12.43 ------------------------------- Typo in awk.y Fixed up strftime.3 and added doc. for %V. Changes from 2.12.41 to 2.12.42 ------------------------------- Fixed bug in devopen() -- if you had write permission in /dev, it would just create /dev/stdout etc.!! Final (?) VMS update. Make NeXT use GFMT_WORKAROUND Fixed bug in sub_common() for substitute on zero-length match. Improved the code a bit while I was at it. Fixed grammar so that $i++ parses as ($i)++ Put support/* back in the distribution (didn't I already do this?!) Changes from 2.12.40 to 2.12.41 ------------------------------- VMS workaround for broken %g format. Changes from 2.12.39 to 2.12.40 ------------------------------- Minor man page update. 
Fixed latent bug in redirect(). Changes from 2.12.38 to 2.12.39 ------------------------------- Updates to test suite -- remove dependence on changing gawk.1 man page. Changes from 2.12.37 to 2.12.38 ------------------------------- Fixed bug in use of *= without whitespace following. VMS update. Updates to man page. Option handling updates in main.c test/manyfiles redone and added to bigtest. Fixed latent (on Sun) bug in handling of save_fs. Changes from 2.12.36 to 2.12.37 ------------------------------- Update REL in Makefile-dist. Incorporate test suite into main distribution. Minor fix in regtest. Changes from 2.12.35 to 2.12.36 ------------------------------- Release takes on dual personality -- 2.12.36 and 2.13.0 -- any further patches before public release won't count for 2.13, although they will for 2.12 -- be careful to avoid confusion! patchlevel.h will be the last thing to change. Cray updates to deal with arithmetic problems. Minor test suite updates. Fixed latent bug in parser (freeing memory). Changes from 2.12.34 to 2.12.35 ------------------------------- VMS updates. Flush stdout at top of err() and stderr at bottom. Fixed bug in eval_condition() -- it wasn't testing for MAYBE_NUM and doing the force_number(). Included the missing manyfiles.awk and a new test to catch the above bug which I am amazed wasn't already caught by the test suite -- it's pretty basic. Changes from 2.12.33 to 2.12.34 ------------------------------- Atari updates -- including bug fix. More VMS updates -- also nuke vms/version.com. Fixed bug in handling of large numbers of redirections -- it was probably never tested before (blush!). Minor rearrangement of code in r_force_number(). Made chem and regtest tests a bit more portable (Ultrix again). Added another test -- manyfiles -- not invoked under any other test -- very Unix specific. Rough beginning of LIMITATIONS file -- need my AWK book to complete it. 
Changes from 2.12.32 to 2.12.33 ------------------------------- Expunge debug.? from various files. Remove vestiges of Floor and Ceil kludge. Special case integer division -- mainly for Cray, but maybe someone else will benefit. Workaround for iop_close closing an output pipe descriptor on Cray -- not conditional since I think it may fix a bug on SGI as well and I don't think it can hurt elsewhere. Fixed memory leak in assoc_lookup(). Small cleanup in test suite. Changes from 2.12.31 to 2.12.32 ------------------------------- Nuked debug.c and debugging flag -- there are better ways. Nuked version.sh and version.c in subdirectories. Fixed bug in handling of IGNORECASE. Fixed bug when FIELDWIDTHS was set via -v option. Fixed (obscure) bug when $0 is assigned a numerical value. Fixed so that escape sequences in command-line assignments work (as it already said in the comment). Added a few cases to test suite. Moved support/* back into distribution. VMS updates. Changes from 2.12.30 to 2.12.31 ------------------------------- Cosmetic manual page changes. Updated sunos3 config. Small changes in test suite including renaming files over 14 chars. in length. Changes from 2.12.29 to 2.12.30 ------------------------------- Bug fix for many string concatenations in a row. Changes from 2.12.28 to 2.12.29 ------------------------------- Minor cleanup in awk.y Minor VMS update. Minor atari update. Changes from 2.12.27 to 2.12.28 ------------------------------- Got rid of the debugging goop in eval.c -- there are better ways. Sequent port. VMS changes left out of the last patch -- sigh! config/vms.h renamed to config/vms-conf.h. Fixed missing/tzset.c Removed use of gcvt() and GCVT_MISSING -- turns out it was no faster than sprintf("%g") and caused all sorts of portability headaches. Tuned get_field() -- it was unnecessarily parsing the whole record on reference to $0. Tuned interpret() a bit in the rule_node loop. 
In r_force_number(), worked around bug in Uglix strtod() and got rid of ugly do{}while(0) at Michal's urging. Replaced do_deref() and deref with unref(node) -- much cleaner and a bit faster. Got rid of assign_number() -- contrary to comment, it was no faster than just making a new node and freeing the old one. Replaced make_number() and tmp_number() with macros that call mk_number(). Changed freenode() and newnode() into macros -- the latter is getnode() which calls more_nodes() as necessary. Changes from 2.12.26 to 2.12.27 ------------------------------- Completion of Cray 2 port (includes a kludge for floor() and ceil() that may go or be changed -- I think that it may just be working around a bug in chem that is being tweaked on the Cray). More VMS updates. Moved kludge over yacc's insertion of malloc and realloc declarations from protos.h to the Makefile. Added a lisp interpreter in awk to the test suite. (Invoked under bigtest.) Cleanup in r_force_number() -- I had never gotten around to a thorough profile of the cache code and it turns out to be not worth it. Performance boost -- do lazy force_number()'ing for fields etc. i.e. flag them (MAYBE_NUM) and call force_number only as necessary. Changes from 2.12.25 to 2.12.26 ------------------------------- Rework of regexp stuff so that dynamic regexps have reasonable performance -- string used for compiled regexp is stored and compared to new string -- if same, no recompilation is necessary. Also, very dynamic regexps cause dfa-based searching to be turned off. Code in dev_open() is back to returning fileno(std*) rather than dup()ing it. This will be documented. Sorry for the run-around on this. Minor atari updates. Minor vms update. Missing file from MSDOS port. Added warning (under lint) if third arg. of [g]sub is a constant and handle it properly in the code (i.e. return how many matches). Changes from 2.12.24 to 2.12.25 ------------------------------- MSDOS port. 
Non-consequential changes to regexp variables in preparation for a more serious change to fix a serious performance problem.

Changes from 2.12.23 to 2.12.24
-------------------------------

Fixed bug in output flushing introduced a few patches back. This caused serious performance losses.

Changes from 2.12.22 to 2.12.23
-------------------------------

Accidentally left config/cray2-60 out of last patch.

Added some missing dependencies to Makefile.

Cleaned up mkconf a bit; made yacc the default parser (no alloca needed, right?); added rs6000 hook for signed characters.

Made regex.c with NO_ALLOCA undefined work.

Fixed bug in dfa.c for systems where free(NULL) bombs.

Deleted a few cant_happen()'s that *really* can't happen.

Changes from 2.12.21 to 2.12.22
-------------------------------

Added to config stuff the ability to choose YACC rather than bison.

Fixed CHAR_UNSIGNED in config.h-dist.

Second arg. of strtod() is char ** rather than const char **.

stackb is now initially malloc()'ed since it may be realloc()'ed.

VMS updates.

Added SIZE_T_MISSING to config stuff and a default typedef to awk.h. (Maybe it is not needed on any current systems??)

re_compile_pattern()'s size is now size_t unconditionally.

Changes from 2.12.20 to 2.12.21
-------------------------------

Corrected missing/gcvt.c.

Got rid of use of dup2() and thus DUP_MISSING.

Updated config/sgi33.

Turned on (and fixed) in cmp_nodes() the behaviour that I *hope* will be in POSIX 1003.2 for relational comparisons.

Small updates to test suite.

Changes from 2.12.19 to 2.12.20
-------------------------------

Sloppy, sloppy, sloppy!! I didn't even try to compile the last two patches. This one fixes goofs in regex.c.

Changes from 2.12.18 to 2.12.19
-------------------------------

Cleanup of last patch.

Changes from 2.12.17 to 2.12.18
-------------------------------

Makefile renamed to Makefile-dist.

Added alloca() configuration to mkconf. (A bit kludgey.)
Just add a single line containing ALLOCA_PW, ALLOCA_S or ALLOCA_C to the appropriate config file to have Makefile-dist edited accordingly. Reorganized output flushing to correspond with new semantics of devopen() on "/dev/std*" etc. Fixed rest of last goof!! Save and restore errno in do_pathopen(). Miscellaneous atari updates. Get rid of the trailing comma in the NODETYPE definition (Cray compiler won't take it). Try to make the use of `const' consistent since Cray compiler is fussy about that. See the changes to `basename' and `myname'. It turns out that, according to section 3.8.3 (Macro Replacement) of the ANSI Standard: ``If there are sequences of preprocessing tokens within the list of arguments that would otherwise act as preprocessing directives, the behavior is undefined.'' That means that you cannot count on the behavior of the declaration of re_compile_pattern in awk.h, and indeed the Cray compiler chokes on it. Replaced alloca with malloc/realloc/free in regex.c. It was much simpler than expected. (Inside NO_ALLOCA for now -- by default no alloca.) Added a configuration file, config/cray60, for Unicos-6.0. Changes from 2.12.16 to 2.12.17 ------------------------------- Ooops. Goofed signal use in last patch. Changes from 2.12.15 to 2.12.16 ------------------------------- RENAMED *_dir to just * (e.g. missing_dir). Numerous VMS changes. Proper inclusion of atari and vms files. Added experimental (ifdef'd out) RELAXED_CONTINUATION and DEFAULT_FILETYPE -- please comment on these! Moved pathopen() to io.c (sigh). Put local directory ahead in default AWKPATH. Added facility in mkconf to echo comments on stdout: lines beginning with "#echo " will have the remainder of the line echoed when mkconf is run. Any lines starting with "#" will otherwise be treated as comments. The intent is to be able to say: "#echo Make sure you uncomment alloca.c in the Makefile" or the like. Prototype fix for V.4 Fixed version_string to not print leading @(#). 
Fixed FIELDWIDTHS to work with strict (turned out to be easy).

Fixed conf for V.2.

Changed semantics of /dev/fd/n to be like on real /dev/fd.

Several configuration and updates in the makefile.

Updated manpage.

Include tzset.c and system.c from missing_dir that were accidentally left out of the last patch.

Fixed bug in cmdline variable assignment -- arg was getting freed(!) in call to variable.

Backed out of parse-time constant folding for now, until I can figure out how to do it right.

Fixed devopen() so that getline <"-" works.

Changes from 2.12.14 to 2.12.15
-------------------------------

Changed config/* to a condensed form that can be used with mkconf to generate a config.h from config.h-dist -- much easier to maintain. Please check carefully against what you had before for a particular system and report any problems. vms.h remains separate since the stuff at the bottom didn't quite fit the mkconf model -- hopefully cleared up later.

Fixed bug in grammar -- didn't allow function definition to be separated from other rules by a semi-colon.

VMS fix to #includes in missing.c -- should we just be including awk.h?

Updated README for texinfo.tex version.

Updating of copyright in all .[chy] files.

Added but commented out Michal's fix to strftime.

Added tzset() emulation based on Rick Adams' code. Added TZSET_MISSING to config.h-dist.

Added strftime.3 man page for missing_dir

More posix: func, **, **= don't work in -W posix

More lint: ^, ^= not in old awk

gawk.1: removed ref to -DNO_DEV_FD, other minor updating.

Style change: pushbak becomes pushback() in yylex().

Changes from 2.12.13 to 2.12.14
-------------------------------

Better (?) organization of awk.h -- attempt to keep all system dependencies near the top and move some of the non-general things out of the config.h files.

Change to handling of SYSTEM_MISSING.

Small change to ultrix config.

Do "/dev/fd/*" etc. checking at runtime.

First pass at VMS port.

Improvements to error handling (when lexeme spans buffers).
Fixed backslash handling -- why didn't I notice this sooner?

Added programs from book to test suite and new target "bigtest" to Makefile.

Changes from 2.12.12 to 2.12.13
-------------------------------

Recognize OFS and ORS specially so that OFS = 9 works without efficiency hit. Took advantage of opportunity to tune do_print*() for about 10% win on a print with 5 args (i.e. small but significant).

Somewhat pervasive changes to reconcile CONVFMT vs. OFMT.

Better initialization of builtin vars.

Make config/* consistent wrt STRTOL_MISSING.

Small portability improvement to alloca.s

Improvements to lint code in awk.y

Replaced strtol() with a better one by Chris Torek.

Changes from 2.12.11 to 2.12.12
-------------------------------

Added PORTS file to record successful ports.

Added #define const to nothing if not STDC and added const to strtod() header.

Added * to printf capabilities and partially implemented ' ' and '+' (has an effect for %d only, silently ignored for other formats). I'm afraid that's as far as I want to go before I look at a complete replacement for do_sprintf().

Added warning for /regexp/ on LHS of MATCHOP.

Changes from 2.12.10 to 2.12.11
-------------------------------

Small Makefile improvements.

Some remaining nits from the NeXT port.

Got rid of bcopy() define in awk.h -- not needed anymore (??)

Changed private in builtin.c -- it is special on Sequent.

Added subset implementation of strtol() and STRTOL_MISSING.

A little bit of cleanup in debug.c, dfa.c.

Changes from 2.12.9 to 2.12.10
------------------------------

Redid compatibility checking and checking for # of args.

Removed all references to variables[] from outside awk.y, in preparation for a more abstract interface to the symbol table.

Got rid of a remaining use of bcopy() in regex.c.

Changes from 2.12.8 to 2.12.9
-----------------------------

Portability improvements for atari, next and decstation.

Bug fix in substr() -- wasn't handling 3rd arg. of -1 properly.

Manpage updates.
Moved support from src release to doc release.
Updated FUTURES file.
Added some "lint" warnings.

Changes from 2.12.7 to 2.12.8
-----------------------------

Changed time() to systime().
Changed warning() in snode() to fatal().
strftime() now defaults second arg. to current time.

Changes from 2.12.6 to 2.12.7
-----------------------------

Fixed bug in sub_common() involving inadequate allocation of a buffer.
Added some missing files to the Makefile.

Changes from 2.12.5 to 2.12.6
-----------------------------

Fixed bug wherein non-redirected getline could call iop_close() just prior to a call from do_input().
Fixed bug in handling of /dev/stdout and /dev/stderr.

Changes from 2.12.4 to 2.12.5
-----------------------------

Updated README and support directory.

Changes from 2.12.3 to 2.12.4
-----------------------------

Updated CHANGES and TODO (should have been done in previous 2 patches).

Changes from 2.12.2 to 2.12.3
-----------------------------

Brought regex.c and alloca.s into line with current FSF versions.

Changes from 2.12.1 to 2.12.2
-----------------------------

Portability improvements; mostly moving system prototypes out of awk.h
Introduction of strftime.
Use of CONVFMT.

Changes from 2.12 to 2.12.1
-----------------------------

Consolidated treatment of command-line assignments (thus correcting the -v treatment).
Rationalized builtin-variable handling into a table-driven process, thus simplifying variable() and eliminating spc_var().
Fixed bug in handling of command-line source that ended in a newline.
Simplified install() and lookup().
Did away with double-mallocing of identifiers and now free second and later instances of a name, after the first gets installed into the symbol table.
Treat IGNORECASE specially, simplifying a lot of code, and allowing checking against strict conformance only on setting it, rather than on each pattern match.
Fixed regexp matching when IGNORECASE is non-zero (broken when dfa.c was added).
Fixed bug where $0 was not being marked as valid, even after it was rebuilt. This caused mangling of $0.

Changes from 2.11.1 to 2.12
-----------------------------

Makefile:
    Portability improvements in Makefile.
    Move configuration stuff into config.h

FSF files:
    Synchronized alloca.[cs] and regex.[ch] with FSF.

array.c:
    Rationalized hash routines into one with a different algorithm.
    delete() now works if the array is a local variable.
    Changed interface of assoc_next() and avoided dereferencing past the end of the array.

awk.h:
    Merged non-prototype and prototype declarations in awk.h.
    Expanded tree_eval #define to short-circuit more calls of r_tree_eval().

awk.y:
    Delinted some of the code in the grammar.
    Fixed and improved some of the error message printing.
    Changed to accommodate unlimited length source lines.
    Line continuation now works as advertised.
    Source lines can be arbitrarily long.
    Refined grammar hacks so that /= assignment works. Regular expressions starting with /= are recognized at the beginning of a line, after && or || and after ~ or !~. More contexts can be added if necessary.
    Fixed IGNORECASE (multiple scans for backslash).
    Condensed expression_lists in array references.
    Detect and warn for correct # args in builtin functions -- call most of them with a fixed number (i.e. fill in defaults at parse-time rather than at run-time).
    Load ENVIRON only if it is referenced (detected at parse-time).
    Treat NF, FS, RS, NR, FNR specially at parse time, to improve run time.
    Fold constant expressions at parse time.
    Do make_regexp() on third arg. of split() at parse time if it is a constant.

builtin.c:
    srand() returns 0 the first time called.
    Replaced alloca() with malloc() in do_sprintf().
    Fixed setting of RSTART and RLENGTH in do_match().
    Got rid of get_{one,two,three} and allowance for variable # of args. at run-time -- this is now done at parse-time.
    Fixed latent bug in [g]sub whereby changes to $0 would never get made.
    Rewrote much of sub_common() for simplicity and performance.
    Added ctime() and time() builtin functions (unless -DSTRICT). ctime() returns a time string like the C function, given the number of seconds since the epoch and time() returns the current time in seconds.
    do_sprintf() now checks for mismatch between format string and number of arguments supplied.

dfa.c
    This is borrowed (almost unmodified) from GNU grep to provide faster searches.

eval.c
    Node_var, Node_var_array and Node_param_list handled from macro rather than in r_tree_eval().
    Changed cmp_nodes() to not do a force_number() -- this, combined with a force_number() on ARGV[] and ENVIRON[] brings it into line with other awks.
    Greatly simplified cmp_nodes().
    Separated out Node_NF, Node_FS, Node_RS, Node_NR and Node_FNR in get_lhs().
    All adjacent string concatenations now done at once.

field.c
    Added support for FIELDWIDTHS.
    Fixed bug in get_field() whereby changes to a field were not always properly reflected in $0.
    Reordered tests in parse_field() so that reference off the end of the buffer doesn't happen.
    set_FS() now sets *parse_field i.e. routine to call depending on type of FS. It also does make_regexp() for FS if needed. get_field() passes FS_regexp to re_parse_field(), as does do_split().
    Changes to set_field() and set_record() to avoid malloc'ing and free'ing the field nodes repeatedly. The fields now just point into $0 unless they are assigned to another variable or changed. force_number() on the field is *only* done when the field is needed.

gawk.1
    Fixed troff formatting problem on .TP lines.

io.c
    Moved some code out into iop.c.
    Output from pipes and system() calls is properly synchronized.
    Status from pipe close properly returned.
    Bug in getline with no redirect fixed.

iop.c
    This file contains a totally revamped get_a_record and associated code.

main.c
    Command line programs no longer use a temporary file. Therefore, tmpnam() no longer required.
    Deprecated -a and -e options -- they will go away in the next release, but for now they cause a warning.
    Moved -C, -V, -c options to -W ala posix.
    Added -W posix option: throw out \x
    Added -W lint option.

node.c
    force_number() now allows pure numerics to have leading whitespace.
    Added make_string facility to optimize case of adding an already malloc'd string.
    Cleaned up and simplified do_deref().
    Fixed bug in handling of stref==255 in do_deref().

re.c
    contains the interface to regexp code

Changes from 2.11.1 to FSF version of same
------------------------------------------

Thu Jan 4 14:19:30 1990 Jim Kingdon (kingdon at albert)

    * Makefile (YACC): Add -y to bison part.
    * missing.c: Add #include .

Sun Dec 24 16:16:05 1989 David J. MacKenzie (djm at hobbes.ai.mit.edu)

    * Makefile: Add (commented out) default defines for Sony News.
    * awk.h: Move declaration of vprintf so it will compile when -DVPRINTF_MISSING is defined.

Mon Nov 13 18:54:08 1989 Robert J. Chassell (bob at apple-gunkies.ai.mit.edu)

    * gawk.texinfo: changed @-commands that are not part of the standard, currently released texinfmt.el to those that are. Otherwise, only people with the as-yet unreleased makeinfo.c can format this file.

Changes from 2.11beta to 2.11.1 (production)
--------------------------------------------

Went from "beta" to production status!!!
Now flushes stdout before closing pipes or redirected files to synchronize output.
MS-DOS changes added in.
Signal handler return type parameterized in Makefile and awk.h and some lint removed.
debug.c cleaned up.
Fixed FS splitting to never match null strings, per book.
Correction to the manual's description of FS.
Some compilers break on char *foo = "string" + 4 so fixed version.sh and main.c.

Changes from 2.10beta to 2.11beta
---------------------------------

This release fixes all reported bugs that we could reproduce. Probably some of the changes are not documented here. The next release will probably not be a beta release!
The most important change is the addition of the -nostalgia option. :-)
The documentation has been improved and brought up-to-date.
There has been a lot of general cleaning up of the code that is not otherwise documented here.
There has been a movement toward using standard-conforming library routines and providing them (in missing.d) for systems lacking them.
Improved (hopefully) configuration through Makefile modifications and missing.c. In particular, straightened out confusion over vprintf #defines, declarations etc.
Deleted RCS log comments from source, to reduce source size by about one third. Most of them were horribly out-of-date, anyway.
Renamed source files to reflect (for the most part) their contents.
More and improved error messages. Cleanup and fixes to yyerror(). String constants are not altered in input buffer, so error messages come out better. Fixed usage message. Make use of ANSI C strerror() function (provided).
Plugged many more memory leaks. The memory consumption is now quite reasonable over a wide range of programs.
Uses volatile declaration if STDC > 0 to avoid problems due to longjmp.
New -a and -e options to use awk or egrep style regexps, respectively, since POSIX says awk should use egrep regexps. Default is -a.
Added -v option for setting variables before the first file is encountered.
Version information now uses -V and copyleft uses -C.
Added a patchlevel.h file and its use for -V and -C.
Append_right() optimized for major improvement to programs with a *lot* of statements.
Operator precedence has been corrected to match draft Posix.
Tightened up grammar for builtin functions so that only length may be called without arguments or parentheses.
/regex/ is now a normal expression that can appear in any expression context.
Allow /= to begin a regexp. Allow ..[../..].. in a regexp.
Allow empty compound statements ({}).
Made return and next illegal outside a function and in BEGIN/END respectively.
Division by zero is now illegal and causes a fatal error.
Fixed exponentiation so that x ^ 0 and x ^= 0 both return 1.
Fixed do_sqrt, do_log, and do_exp to do argument/return checking and print an error message, per the manual.
Fixed main to catch SIGSEGV to get source and data file line numbers.
Fixed yyerror to print the ^ at the beginning of the bad token, not the end.
Fix to substr() builtin: it was failing if the arguments weren't already strings.
Added new node value flag NUMERIC to indicate that a variable is purely a number as opposed to type NUM which indicates that the node's numeric value is valid. This is set in make_number(), tmp_number and r_force_number() when appropriate and used in cmp_nodes(). This fixed a bug in comparison of variables that had numeric prefixes. The new code uses strtod() and eliminates is_a_number(). A simple strtod() is provided for systems lacking one. It does no overflow checking, so could be improved.
Simplification and efficiency improvement in force_string.
Added performance tweak in r_force_number().
Fixed a bug with nested loops and break/continue in functions.
Fixed inconsistency in handling of empty fields when $0 has to be rebuilt. Happens to simplify rebuild_record().
Cleaned up the code associated with opening a pipe for reading. Gawk now has its own popen routine (gawk_popen) that allocates an IOBUF and keeps track of the pid of the child process. gawk_pclose marks the appropriate child as defunct in the right struct redirect.
Cleaned up and fixed close_redir().
Fixed an obscure bug to do with redirection. Intermingled ">" and ">>" redirects did not output in a predictable order.
Improved handling of output buffering: now all print[f]s redirected to a tty or pipe are flushed immediately and non-redirected output to a tty is flushed before the next input record is read.
Fixed a bug in get_a_record() where bcopy() could have copied over a random pointer.
Fixed a bug when RS="" and records separated by multiple blank lines.
Got rid of SLOWIO code which was out-of-date anyway.
Fix in get_field() for case where $0 is changed and then $(n) are changed and then $0 is used.
Fixed infinite loop on failure to open file for reading from getline. Now handles redirect file open failures properly.
Filenames such as /dev/stdin now allowed on the command line as well as in redirects.
Fixed so that gawk '$1' where $1 is a zero tests false.
Fixed parsing so that `RLENGTH -1' parses the same as `RLENGTH - 1', for example.
The return from a user-defined function now defaults to the Null node. This fixes a core-dump-causing bug when the return value of a function is used and that function returns no value.
Now catches floating point exceptions to avoid core dumps.
Bug fix for deleting elements of an array -- under some conditions, it was deleting more than one element at a time.
Fix in AWKPATH code for running off the end of the string.
Fixed handling of precision in *printf calls. %0.2d now works properly, as does %c. [s]printf now recognizes %i and %X.
Fixed a bug in printing of very large (>240) strings.
Cleaned up erroneous behaviour for RS == "".
Added IGNORECASE support to index().
Simplified and fixed newnode/freenode.
Fixed reference to $(anything) in a BEGIN block.
Eliminated use of USG rand48().
Bug fix in force_string for machines with 16-bit ints.
Replaced use of mktemp() with tmpnam() and provided a partial implementation of the latter for systems that don't have it.
Added a portability check for includes in io.c.
Minor portability fix in alloc.c plus addition of xmalloc().
Portability fix: on UMAX4.2, st_blksize is zero for a pipe, thus breaking iop_alloc() -- fixed.
Workaround for compiler bug on Sun386i in do_sprintf.
More and improved prototypes in awk.h.
Consolidated C escape parsing code into one place.
strict flag is now turned on only when invoked with compatibility option. It now applies to fewer things.
Changed cast of f._ptr in vprintf.c from (unsigned char *) to (char *). Hopefully this is right for the systems that use this code (I don't).
Support for pipes under MSDOS added.
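
A few of the 2.11beta behavior fixes listed above can be spot-checked from a shell. The following is only a sketch: it assumes a POSIX-compatible awk is installed as `awk` (not necessarily gawk), and the shell variable names are illustrative and not part of gawk.

```shell
# Assumption: `awk` on PATH is a POSIX-compatible awk.
# Variable names below are hypothetical, chosen for this example only.

# "Fixed exponentiation so that x ^ 0 and x ^= 0 both return 1."
pow_zero=$(awk 'BEGIN { x = 5; print x ^ 0 }')
pow_assign=$(awk 'BEGIN { x = 5; x ^= 0; print x }')

# "Fixed so that gawk '$1' where $1 is a zero tests false."
# Only the record whose first field is non-zero should be printed.
nonzero_rows=$(printf '0 a\n1 b\n' | awk '$1')

# "`RLENGTH -1' parses the same as `RLENGTH - 1'."
# match() sets RLENGTH to 1 here, so the subtraction yields 0.
rlength_minus=$(awk 'BEGIN { match("abc", /b/); print RLENGTH -1 }')

printf '%s %s [%s] %s\n' "$pow_zero" "$pow_assign" "$nonzero_rows" "$rlength_minus"
```

Each check captures the awk output in a variable so that the result can be compared against the behavior the corresponding entry describes.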