Q L   H A C K E R ' S   J O U R N A L
      ===========================================
           Supporting  All  QL  Programmers
      ===========================================
         #27                       January 1998 
      
    The QL Hacker's Journal (QHJ) is published by Tim
Swenson as a service to the QL Community.  The QHJ is
freely distributable.  Past issues are available on disk,
via e-mail, or via the Anon-FTP server, garbo.uwasa.fi. 
The QHJ is always on the lookout for article submissions.

  QL Hacker's Journal
     c/o Tim Swenson
     38725 Lexington St. #230
     Fremont, CA 94536
     swensontc@geocities.com
     http://www.geocities.com/SilconValley/Pines/5865/


EDITOR'S FORUM

The QHJ is back.  After a year of taking a break, I'm back
in the programming spirit again.  Of course, I have not been
inactive during that time, as any reader of QL Today can
attest.  I just have not felt like writing any programs for
a while.  I guess I did get burnt out a bit.  Now we'll see
how long before I get burnt out again.

Having recently purchased Qliberator, I have found its
manual similar to the original QL manual: full of
information, but it is hard to find what you need without
reading the whole manual.  I'm all for reading the whole
manual, but sometimes it takes a while to figure out exactly
how to apply what you are reading.  I sometimes like manuals
that are more "If you want to do this, this is how to do it."

From this thought came the idea for the "Qlib Source Book",
which will be something similar to the "Z88 Source Book".
The Z88 Source Book was a collection of existing knowledge
about the Z88.  Most of the Z88 Source Book came from older
published sources.  With Qlib, there does not seem to be a
wealth of published material helping the beginning Qlib
user.  So, time to send out a query and ask for material.

If you are an experienced Qlib user and have a few tricks
that you would like to pass on, please send them to me
(either hard copy, disk or e-mail).  If you are a beginning
Qlib user and you have questions that you would like to see
answered, send them too.  Since I do not have the knowledge
to really do the subject well, I will play the role of
editor.  I'll collect the different submissions and put them
in an organized document.  The Qlib Source Book will be
Freeware in its electronic form.  Like the Z88 Source Book,
a hard copy version will probably be available at minimal
cost.  With the Z88 Source Book the price of the book
covered the cost of production and a small profit to FWD
Computing.  It was my way of supporting the primary US QL
dealer.

Through the QHJ and QL Today I'll keep QLers informed of my
progress.  I've already volunteered Dilwyn Jones to help in
writing some parts.  Dilwyn has a number of years of
experience with Qliberator and producing commercial
software.

So, here is the long awaited next issue of the QHJ.  Feel
free to send any comments, complaints, articles, large
denomination bills, etc.  Enjoy.


REGULAR EXPRESSIONS

In all the years that I've been dealing with Unix, one of
the things that I have not taken the time to really learn is
Regular Expressions.  Regular expressions are based on a
mini-language used for pattern matching in a number of Unix
search utilities.  The best known of these programs is
grep and its variations fgrep and egrep.  The name 'grep'
even comes from the ed editor command g/re/p: globally
search for a regular expression and print.

No matter what operating system you have used, you have
probably run across a simple relative of the regular
expression.  Most operating systems have a way of
understanding something like this: "dir *.txt".  In MS-DOS
this means to list all files that end with a .txt
extension.  In QDOS, the equivalent phrase would be
"wdir flp1__txt".  The asterisk or star, "*", is a wild
card and means to match any string.  The asterisk is really
a metacharacter.  Metacharacters are special characters
that mean different things in the regular expression
language.  More experienced users of MS-DOS may have used
something like this: "dir *.e??".  This means to match all
files whose extension starts with an e.  It will match
.exe, .efs, .exx, and others.  The question mark is a
metacharacter that means to match any single character.
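
As a quick illustration of the two wild cards, here is a small
sketch in Python rather than on the QL.  It uses the standard
fnmatch module, which follows the same "*" and "?" idea; the file
names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, for illustration only.
names = ["readme.txt", "notes.txt", "prog.exe", "data.efs", "prog.bas"]

# "*" matches any run of characters; "?" matches exactly one character.
txt_files = [n for n in names if fnmatchcase(n, "*.txt")]
e_files = [n for n in names if fnmatchcase(n, "*.e??")]

print(txt_files)
print(e_files)
```

The second pattern picks up prog.exe and data.efs but not
prog.bas, because the extension has to start with an e.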

So what does all this mean to QDOS users?  Well, a version
of grep has been ported to the QL and comes with the C68
distribution.  Grep is a very powerful and popular utility
that can fill a number of needs.  It is used to extract
lines of text from files, but with its handling of regular
expressions, it can be very smart on what it extracts.  Once
you know how grep works and how to use it, you will probably
remember a time when it would have been useful to you.

With grep, you can do two things with its output: it can go
to standard output or you can redirect it to a file.  Since
the QL does not have standard output, the QL version of grep
opens a window to display its results.  It also supports
file redirection.  This means that you can send the output
of grep to a file to be dealt with later.

To demonstrate the file redirection, let's take a look at a
short grep example.  In this example we have a text file and
we want to find all lines that have the string "ql" in them:

   exec flp1_grep;"ql flp1_file_in > flp1_file_out"

Since we are using arguments, we have to put them in quotes
after the grep command.  The results of the grep will now be
in the file flp1_file_out.
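
For readers without grep to hand, the basic job it does can be
sketched in a few lines of Python.  This is only a rough
stand-in, not how the QL grep is written; the function name
grep_lines and the sample text are my own inventions.

```python
import re

def grep_lines(pattern, lines):
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Hypothetical file contents, one string per line.
sample = ["the ql is a fine machine", "QL Today is a magazine", "nothing here"]

# Like grep, the search is case sensitive, so only the first line is kept.
hits = grep_lines("ql", sample)
print(hits)
```

Writing the list of hits to a second file would be the
equivalent of the "> flp1_file_out" redirection.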

Before we go too far, let's talk about three major concepts
in regular expressions: characters, metacharacters, and
character classes.  A character is basically a byte, be it a
text byte or binary byte.  Metacharacters are a set of
characters that are part of the regular expression language.
In the examples above, the asterisk is a metacharacter.  A
character class is a way of matching a group of characters.

Let's take a look at the metacharacters:

A character matches itself.  Any character or string of
characters is taken as a literal.  If you want to find the
string "ing" in a file you would use the regular expression
"ing".  Most of the time when I am using grep, I use only
literal characters.

A dot (.) matches any character, but only 1 character,
similar to the question mark in MS-DOS.  If you want to find
a word in a text file that has three letters, starts with a
B and ends with D, then you would use the regular expression
B.D (grep is case sensitive.  Upper case lettering has only
been used to highlight the example.).

The caret (^) means the beginning of a line.  If you want to
find all lines that start with the word "The", you would use
the regular expression "^The".

The dollar sign ($) means the end of a line.  If you want to
find all lines that end with the word "end", you would use
the regular expression "end$".

The question mark (?) is used to match an optional
character.  If you wanted to find the word "color" but don't
know if the British spelling "colour" is used, the regular
expression "colou?r" would work.  The ? makes the character
just before it, here the u, optional.

The plus (+) is used to match one or more of the item just
before it.  The regular expression "hel+p" matches help and
hellp, but not hep.  If you want to find the words helper or
helps, but not just help, you can combine the plus with the
dot: "help.+" needs at least one more character after help
or it will fail.

The asterisk (*) is used like +, but it allows a null match.
The regular expression "hel*p" matches hep as well as help
and hellp.  To find the words helper, helps and help, the
regular expression "help.*" would work.  The asterisk allows
for no character at all, as in the case of just help.
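
The metacharacters so far can be tried out in Python, whose re
module uses the same symbols.  This is a sketch for
experimenting, not a copy of what the QL grep does inside; the
helper name matches is my own.  Note that ?, + and * each act
only on the single item just before them.

```python
import re

def matches(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

print(matches("B.D", "BAD"))         # the dot stands for exactly one character
print(matches("^The", "The end"))    # anchored at the start of the line
print(matches("end$", "The end"))    # anchored at the end of the line
print(matches("colou?r", "color"))   # the u is optional
print(matches("hel+p", "hellp"))     # one or more l characters
print(matches("hel*p", "hep"))       # zero or more l characters
print(matches("help.+", "help"))     # needs one more character after help
```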

To get a little more power out of regular expressions, there
is a metacharacter for the logical OR, the pipe symbol (|).
Say you have a text file with a bunch of e-mail messages and
you want to find all of the From and Subject lines, you
would use the regular expression "From|Subject".


Now that you know how to use the OR metacharacter, you will
find that you need to limit the OR.  That's where the
parentheses () come in.  Using the last example of finding
the From and Subject lines from e-mail messages, the
regular expression "From|Subject" will also find lines with
either word anywhere in them.  With e-mails, the From in the
From line is always followed by a colon: "From:".  The same
goes for Subject.  Now how do we write a regular expression
for this?  One way is this: "From:|Subject:".  This will
work, but a "cleaner" approach is this: "(From|Subject):".
Since AND's are assumed in regular expressions, what you get
is this "( From OR Subject ) AND :".  Just like in math, the
parentheses control the bounds of the OR condition.
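
Here is a small Python sketch of the difference the parentheses
make.  The mail lines and addresses are invented for the
example, and Python's re module is standing in for grep.

```python
import re

# Hypothetical lines from a mail file.
mail = [
    "From: tim@example.com",
    "Subject: QHJ #27",
    "I heard From him today",
    "To: reader@example.com",
]

loose = re.compile("From|Subject")       # either word, anywhere in the line
tight = re.compile("(From|Subject):")    # either word, followed by a colon

loose_hits = [line for line in mail if loose.search(line)]
tight_hits = [line for line in mail if tight.search(line)]

print(len(loose_hits), len(tight_hits))
```

The loose pattern also picks up the third line; the grouped
pattern keeps only the two header lines.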

The backslash (\) is used to make a metacharacter a literal.
If you want to look for all lines that end with a full
sentence, meaning they end with a period, you could use the
following regular expression: ".$".  But, since the period
is a metacharacter you will find all lines that end with
any character.  To get grep to use the period as a period,
you need to use the backslash like this: "\.$".  The
backslash tells grep to take the next character as a literal
and not to interpret it.
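
The effect of the backslash is easy to see in a short Python
sketch (again Python's re module, not grep itself, and with
made-up sample lines):

```python
import re

# Hypothetical lines of text.
lines = ["This is a sentence.", "No full stop here", "Really?"]

# Unescaped, the dot matches any last character, so every line is found.
any_last_char = [line for line in lines if re.search(r".$", line)]

# Escaped, the dot is a plain period, so only the first line is found.
ends_with_period = [line for line in lines if re.search(r"\.$", line)]

print(any_last_char)
print(ends_with_period)
```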

Character classes are used as a way to search for groups of
characters.  Say you wanted to match the digits 1 to 3.
You could do this with "(1|2|3)".  Using the brackets,
you could also create a character class "[123]".  The true
power of the character class comes when using the dash.
The dash means to create a range of characters
(metacharacters mean something else when in a character
class).  In the last example, the character class could also
be written as "[1-3]", meaning all characters from 1 to 3.
To define the letters of the alphabet the character class
would be "[a-z]".  Since grep is case sensitive, a better
character class would be "[a-zA-Z]".

You can mix up characters in a character class any way you
like.  Say you have to find all occurrences of numerical
dates in a file.  Dates could be written as 7-23-97, or
7/23/97, or even 7.23.97.  You want to find any dates with a
dash, slash, or period.  You would create the character
class "[-/.]".  Remember that the period means only itself
when inside a character class and does not mean to match a
single character.  The dash also means only itself here
because it comes first in the class.  So to find our date,
we would use the regular expression "7[-/.]23[-/.]97".

The caret (^) means something else when used in a character
class; it means to negate the class.  If you want to match
anything but digits, you would create the character class
"[^0-9]".  The caret works to negate only when it is used
immediately after the opening bracket.  If it is used after
that it only means itself.  The character class "[-.^]"
matches only a dash, period, or caret.
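
Character classes can be tried the same way in Python.  In
Python's re module a range inside a class is written with a
dash, as in "[0-9]", and a dash placed first in the class is
just a dash.  The sample strings below are made up for the
example.

```python
import re

# The date pattern from the text: dash, slash or period as separator.
date = re.compile(r"7[-/.]23[-/.]97")
texts = ["7-23-97", "7/23/97", "7.23.97", "7 23 97"]
found = [t for t in texts if date.search(t)]

# A negated class: every character that is not a digit.
non_digits = re.findall(r"[^0-9]", "QL 1998")

# A caret that is not first in the class only means itself.
punctuation = re.findall(r"[-.^]", "a-b.c^d")

print(found)
print(non_digits)
print(punctuation)
```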

If you are interested in learning more, check out the book
"Mastering Regular Expressions" by Jeffrey Friedl.


END-OF-FILE FINDING

A lot of the programs that I like to write are filters.
They take a text file as input, do something to the file,
and output the results to a second file.  Doing this
involves reading a file one line at a time.  A way of doing
this would be something like this:

     REPeat loop
        INPUT #4,in$
        IF EOF(#4) THEN EXIT loop
        PRINT in$
     END REPeat loop

This algorithm will work, except that it will not output the
last line.  When I first tried this, I could not figure out
why the last line was not being output.  It was all based on
how I saw the program being executed.  I thought that the
INPUT statement would read in the end-of-file (EOF) marker
and then do a compare.  What is really happening is that the
last line is read in, then the EOF check is made.  Since the
file pointer advanced after reading in the last string, it
is now pointing at the EOF marker.  When the EOF check is
done, it returns TRUE and the EXIT loop is done.  A better
example would be this:

     REPeat loop
        IF EOF(#4) THEN EXIT loop
        INPUT #4,in$
        PRINT in$
     END REPeat loop

This will print out the last line of the file.  But, this
algorithm also has its faults.  It assumes that there is an
end-of-line (EOL) marker at the end of the last line.  If
there were no EOL and only the EOF, an error would occur
reading in the last line.

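A better routine would read in each character and put the
line together while constantly checking for an EOF.  Since
it hands back a value, it has to be a function rather than
a procedure, and it has to stop at the EOL as well.  Here
is an example:

     DEFine FuNction read_line$
        in$ = ""
        REPeat loop
           REMark stop at end of file, even with no final EOL
           IF EOF(#4) THEN EXIT loop
           byte$ = INKEY$(#4,-1)
           REMark CHR$(10) is the EOL (line feed) character
           IF byte$ = CHR$(10) THEN EXIT loop
           in$ = in$ & byte$
        END REPeat loop
        RETurn in$
     END DEFine read_line$

It would be used like this:

    next_line$ = read_line$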
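
The character-at-a-time idea can be sketched in Python to show
how it copes with a last line that has no EOL.  This is not a
translation of the SuperBasic routine; the names read_line and
all_lines and the use of None to signal end of file are my own
choices.

```python
import io

def read_line(stream):
    """Read one line a character at a time; return None at end of file."""
    chars = []
    while True:
        ch = stream.read(1)
        if ch == "":                      # end of file reached
            return "".join(chars) if chars else None
        if ch == "\n":                    # end of line reached
            return "".join(chars)
        chars.append(ch)

def all_lines(stream):
    """Collect every line, including a last line with no EOL."""
    lines = []
    while True:
        line = read_line(stream)
        if line is None:
            break
        lines.append(line)
    return lines

# A "file" whose last line has no EOL marker.
print(all_lines(io.StringIO("one\ntwo\nthree")))
```

Returning None rather than an empty string lets the caller tell
an empty line from the end of the file.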

If using Qliberator, you can use the Q_ERR function to
locate EOF.  Q_ERR can only trap for EOF after the fact.
You keep reading through the file until you get an EOF
error, which is trapped by Q_ERR.  This means that you would
check for Q_ERR/EOF after an INPUT statement.  An example
is:

     Q_ERR_ON "INPUT"
     REPeat loop
        INPUT #4,in$
        IF Q_ERR = -10 THEN EXIT loop
        PRINT in$
     END REPeat loop
     Q_ERR_OFF


BACKGROUND PROGRAMS

Back in the heyday of MS-DOS, before MS-Windows, there was
a neat type of program called "Terminate and Stay Resident"
(TSR).  The program could be loaded up at boot time, remain
in memory while other programs were running, and could be
called up at any time.  The program would stay in the
background until a funny key sequence was typed in, then it
would pop-up in front of the current program and be ready to
do something.  Sidekick was the first popular program to do
this.

Since MS-DOS could not multitask, how this was done is still
a mystery to me.  In the QDOS world, where multitasking is a
reality, a program like this is fairly easy to do.  Since
SuperBasic will not multitask, the end program has to be
compiled in some way.  For this article, I'll use Qliberator
to compile SuperBasic.

A background job is designed to be hidden and not appear
until it needs to.  This means that the program will not
immediately open any windows and only open them when
necessary.

When compiling this with Qliberator, be sure to turn the
WINDS option off.  The program will open its own windows.
If you have WINDS turned on, the program will execute, but
you will need to do a CTRL-C to get back to QDOS.  If
anybody knows exactly what I'm doing wrong, please let me
know.

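     100 job = Q_MYJOB
     110 QP job,128
     115 REMark poll keyboard row 7 for the hot key
     116 REMark 20 should be ALT (4) + V (16) in that row
     120 x = KEYROW(7)
     130 IF x = 20 THEN 
     140   BEEP 1000,10
     145   REMark only now open a window and pop up
     150   OPEN #3,con_50x50a100x100_32
     160   PAPER #3,0: INK #3,2: BORDER #3,4,2: CLS #3
     170   PRINT #3,"Hello"
     175   REMark wait for a key press, then hide again
     180   x$ = INKEY$(#3,-1)
     190   CLOSE #3
     200 END IF 
     210 GO TO 120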


MICROEMACS LINE NUMBERING

I've been meaning to tinker around with MicroEmacs macros
for some time, but never got around to it.  Recently I
decided to take the time to really give it a try.  Of all of
the text editors available for the QL, I think MicroEmacs is
the most powerful.  Its macro language is the most robust
of the editors.  Both QED and ED have macros that can
automate keystroke commands, but they don't have any logic
(IF..THEN) or structure (WHILE) features.  MicroEmacs has
looping and logic controls.

As an example, I thought that a line number macro would be
nice.  The following macro goes to the beginning of the file
and starts putting line numbers on each line.  Before it
does this it queries you for a starting line number, which
is then incremented in 10's.  To determine when to stop
processing, I had to know when the macro had reached the end
of the file.  Since there is no end-of-file checking
mechanism, I had to move to the end of the file and get the
line number of the last line.  This was then used in the
while loop.  If there are lots of empty lines at the bottom
of the file, the macro will number them also.  A check
could be put in to see if the current line is empty, but
this would not work if a line had only white space in it
(tabs and/or spaces).

I noticed two differences between the execution of
MicroEmacs and ED/QED macros.  One, ED/QED macros are kind
of slow and take a while to run.  MicroEmacs macros are very
fast.  Total run time for this macro on a 20 line routine
was about 1-2 seconds.  Two, when executing ED/QED macros
you can see what is going on as it happens.  The screen
updates with each command.  With MicroEmacs, the screen
seems to update only at the end of the macro.  When the
macro went to the bottom of the file and then returned to
the top, I thought it would display the movement, but it did
not.  If you do want to update the display while a macro is
executing, there is a redraw screen command that you can
use.

The documentation for the MicroEmacs macros is good in
documenting the different commands, but it falls short of
providing many examples.  I used other macros that came with
MicroEmacs to learn from.  This can slow down the learning
process, but there is no other alternative.  In some ways I
use this same technique in other languages.  I keep bits of
code around so I don't have to memorize how to do a routine
in a particular language, I just go through my old code.

; Line Numbering Macro

set %line_num @"Starting Line Number? "

end-of-file
set %tot_lines $curline  ;LET tot_lines=line number @ EOF
beginning-of-file

!while &less $curline %tot_lines 
   beginning-of-line
   insert-string %line_num
   insert-string " "
   set %line_num &add %line_num 10 ;LET line_num=line_num+10
   next-line
!endwhile

beginning-of-file
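
For comparison, here is a rough Python sketch of what the macro
does to the text: put a number and a space at the front of each
line, stepping by 10.  The function name number_lines and its
parameters are mine and have nothing to do with MicroEmacs.

```python
def number_lines(lines, start, step=10):
    """Prefix each line with a line number and a space."""
    return [f"{start + i * step} {line}" for i, line in enumerate(lines)]

# Hypothetical three-line file; note that the empty line is numbered too.
numbered = number_lines(["PRINT 1", "", "STOP"], start=100)
for line in numbered:
    print(line)
```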


ADDING CONFIG BLOCKS TO QLIB PROGRAMS

BasConfig is a utility, written by Oliver Fink, that creates
config blocks for Qliberator compiled programs.  For those
that don't know, config blocks are extra chunks of data
added to programs that are changeable by the user, using the
program "config".  In other words, if you have a program and
you want the user to be able to change the size of the
program's window, you can put the variables for the window
size in a config block and let the user configure it anytime
they want.  Config blocks are part of the executable and do
not interfere with the running of the program.  The 'config'
program knows where in the executable the config block is
and knows how to change it.

Another way of looking at the config block is as an object
that has some data that is used by your program and is
separate from your program.  In fact, until your program is
compiled, the config block is a separate file from your
SuperBasic program.  This block is accessible from both your
program and the "config" program.

BasConfig creates a file that has the config block and some
SuperBasic extensions that allow the program access to the
block.  These extensions need only be LRESPRed when you are
developing your program.  They can be compiled into your
program and become part of the executable.

Before you use BasConfig, you need to define what type of
data you want the user to be able to change.  There are 7
different data types that are allowed in config blocks:

     String
     Long Word
     Word
     Byte
     Select
     Code
     Char

BasConfig does not support the Long Word or Select data
types.  I don't have any documentation on config, so I can't
say exactly what the difference is between the types other
than what is obvious.

To access the data in the config block, there is a function
for each data type supported by BasConfig:

     C_STR$(n)   - String
     C_WORD(n)   - Word
     C_BYTE(n)   - Byte
     C_CODE(n)   - Code
     C_CHAR(n)   - Char

The functions return the Nth item of that data type in the
config block.  If you have two CHARs and one STRING in the
config block and you wanted to get the second CHAR, you
would do something like this:

     var$ = C_CHAR(2)

If your config block does not have a CHAR data type, you
should get back some sort of error (I have not tested this).
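
The "Nth item of a given type" idea can be pictured with a small
Python sketch.  This is only a model of the counting rule; the
list layout, the name nth_item and the error it raises are my
assumptions and say nothing about how BasConfig really stores
or reads the block.

```python
# Hypothetical model of a config block: (type, value) pairs in order.
config_block = [("CHAR", "y"), ("STRING", "flp1_"), ("CHAR", "n")]

def nth_item(block, type_name, n):
    """Return the n-th item (counting from 1) of the given data type."""
    values = [value for kind, value in block if kind == type_name]
    if not 1 <= n <= len(values):
        # Assumed behaviour: asking for a missing item is an error.
        raise IndexError(f"no {type_name} item number {n}")
    return values[n - 1]

# The second CHAR is the third entry overall, because the STRING
# in between is not counted.
second_char = nth_item(config_block, "CHAR", 2)
print(second_char)
```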

To learn how all of this works, I created a SuperBasic
program that opens a window and displays the contents of the
two BYTE data items in a config block.  The example code is:

     100 REMark $$asmb=ram1_test_cfg,0,10
     110 EXT_FN "C_BYTE"
     120 OPEN #3,scr_100x100a50x50
     130 PAPER #3,0: INK #3,4: CLS #3
     140 item1 = C_BYTE(1)
     150 item2 = C_BYTE(2)
     160 PRINT #3,"Item #1 = ";item1
     170 PRINT #3,"Item #2 = ";item2
     180 PAUSE 500
     190 CLOSE #3

Note the $$asmb directive that links the config block into
the program.  It is BasConfig that creates this block, which
includes the 5 functions to access the config block.  The
EXT_FN command tells Qliberator that the references to
C_BYTE will be resolved at link time.

To create the config block, exec basconfig_obj.  The program
will ask you how many different config items you want.
For this example, I entered 2.  Next you are asked to enter
the name of your final program and its version number.  This
is used by the "config" program to let the user know exactly
what program they are configuring.  These two items cannot
be changed by the user.

Now the program will query you for the data type of the
first data item.  You can scroll through the data types by
hitting the left arrow key.  I scrolled over to "Byte" and
hit return.  Since each data type is different the next few
questions will be different for each data type.  In the case
of the "Byte" data type the items were: Initial value,
Minimum value, and Maximum value.  The Min and Max values
give you control of the changes the user can make, so that a
"bad" configuration can't be made.  For this example, I gave
the first item an initial value of 10 and the second a value
of 20.

Once I answered all of the questions for the second config
item, the program asked for a file to store the config
block.  It looks like the convention for config block file
name extensions is _cfg.

Now, the documentation for BasConfig is very sparse.  I had
to figure out how to get the data out of the config block by
reading the source code for BasConfig.  So, I have only done
just enough to get a fair idea of what is going on and how
to get it to work.

    Source: geocities.com/svenqhj