From Pickwiki
Revision as of 17:46, 9 January 2008 by Ian McGowan (talk)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

HomePage >> SourceCode >> BasicSource >> Coding Standards
Users >> Rex_Gozar >> Coding Standards

Coding Standards

2007-11-29 by Rex Gozar


Coding standards are basic rules to help us catch bugs and improve the quality of the programs we produce. They help us in all phases of development by giving us stable and time-tested coding structures that work.

Coding standards are merely guidelines. They should be followed 99% of the time, and bent or discarded that 1% of the time when the standard adds confusion instead of clarity. That doesn't mean that the programmer has free will to code in any style whenever he wants; it means that the programmer must have sound reasons for not following the standard (and explain why in the code.)

The coding standards are based on the IBM Universe database. While most of the standards described can be applied to other Pick variants, the standard will not make explicit allowances for other environments.


This coding standard is not complete by any stretch of the imagination. Standards will be added or modified over time.


Coding standards are devised to help us in several areas:

  • Prevent common bugs
    Some types of bugs addressed in this coding standard include (a) conflicting attribute number usage, (b) improper operations on variables, and (c) unintended behavior.
  • Reveal business rules
    Like it or not, programs are the most precise representation of a company's business rules. We should write programs in such a way to express those business rules in a human-readable manner.
  • Manage complexity
    Programmers can only hold so much information in their heads at any one time. Coding standards help compartmentalized program instructions to aid programmer comprehension. Coding standards help to reveal a natural structure and logical flow.

This document is NOT meant to be an exhaustive discussion weighing the pros and cons of different coding practices. If you 're really serious about this kind of stuff, the book Code Complete by Steve McConnell is the authoritative text on the subject; get it from your local library or bookstore.


A word about programming "style".

The term "coding standards" used in this document refers to what some people call "coding style guidelines". And this differs from what a programmer might call his or her "individual coding style".

In some cases, the standard presumes to take precedence over individual style. The reality is that the programmer still has to make hundreds of decisions about how to write the code. This leaves plenty of room for a programmer to express his or her "style".

1. Variable and Constant Naming

Variable and constant naming standards are designed to provide semantics for program data. Constants, whether literal or equated, should be used to prevent unintentional resetting of values, and seeing ESC in a program is easier to understand than CHAR(027).

1.1. All files have a standard, unique 2-4 letter abbreviation.


1.2. All program variables that relate to a file use the abbreviation as part of its name. The suffix ".ID" will refer to the record identifier. The suffix ".ITEM" will be used to refer to a dynamic array record. The suffix ".REC" will be used to refer to a dimensioned array record. The file abbreviation and an attribute name refers to an extracted data value. Dots (.) will be used to delimit words.

	CM.ID	refers to the id used to read from the CUST-MAST file.
	CM.ITEM	refers to the entire record READ from the CUST-MAST file.
	CM.REC	refers to the entire record MATREAD from the CUST-MAST file.
	CM.NAME	refers to name data extracted from or replaced into a CUST-MAST

This way, wrong code will look wrong, as illustrated in the following examples. More importantly, a programmer would not type in the mistake in the first place.

        * RIGHT *

        * WRONG *
        WRITE POH.ITEM ON PO.XREF, POH.ID  ;* variable prefixes don't match the file's abbrev - WRONG!
        CM.REC<CM$NAME> = TEXT             ;* "<>" replace does not work with a dimensioned variable

        * ??? *
        WRITE RECORD ON PO.XREF, ID      ;* could be right or wrong -- can't tell
        RECORD<4> = TEXT                 ;* must read more code to determine correctness

1.3. All files have a standard include file that defines all the attribute constants. Each attribute (i.e. field number) has a single, unique attribute constant. Note that dollar signs ($) are used in the attribute constant (to denote that it is an equated constant.) An attribute name should also give an indication of the data type: ".AMT" for amounts, ".DATE" for dates, etc.


1.4. Constants or variables that have "DATE" in the name are assumed to use the standard date conversion (like "D2/"). Likewise, names with "AMT" or "AMOUNT" are assumed to use the standard amount conversion (like "MR2").

1.5. Booleans should have "FLAG" or "SW" (for switch) in their names. Boolean values are either @TRUE or @FALSE.


1.6. Associated multi-values should have associated names.


1.7. Equated constants are defined in the INC file. Constants use dollar signs ($) to delimit words. Semantically, the dollar sign tells us that the variable is a constant, and it's defined outside of the program. Optionally, a variable may start with "EQU$" when it's defining an expression, or "LIT$" when defining a literal string.

	EQU MSG$REQUIRED TO "data required"

1.8. File attribute constants are defined in INC with the record id matching the actual file name.


Field numbers in INC files take precedence over dictionary items. For example, if the INC file equates IM$DEALER.PR to 28 and the dictionary item DEALER-PR shows a different attribute number, the INC file is correct and the dictionary should be changed to attribute 28 (why? only programs write data, so they take precedence whether you like it or not.)

1.9. Constants for programs may be defined in INC with the record id matching the program name plus ".EQU" or ".H".


1.10. Commonly defined constants are already defined in STD.EQU and should be included in all programs.


1.11. Common area definitions should be defined in INC with the ".COM" or ".COMMON" suffix.


1.12. Equated constants should be used in programs to extract data from records and control processing. Hardcoded numbers (a.k.a. "magic numbers") should never be used in programs. Constants make it possible to search for file attribute reads/updates.


Again, wrong code should look wrong. Here's a real-life example; I don't know how many years the following bug was in production. Unless you were trying to modify these particular lines of code, you would probably just glance at them and assume they were correct. Of course, because I'm using this as an example you have probably spotted the bug at first glance -- or did you?

        * programmers assume the numbers are correct *
        LN.ID = OH.REC(10)<1,POS>
        IF LN.ID = "" THEN
           PROFIT = OH.REC(11) + OH.REC(72) - TCOST + OH.REC(13)  ;* includes parts warranty adj
           IF OH.REC(11) THEN
              PCT.PROFIT = INT(((PROFIT * 1000) / OH.REC(11)) + .5)
           IF LABOR + 0 = 0 THEN
              LABOR = OH.REC(17)

Here's the same code using attribute constants. Now can you see the bug?

        * constants make the bug obvious *
        LN.ID = OH.REC(OH$LINES)<1,POS>
        IF LN.ID = "" THEN
           PROFIT = OH.REC(OH$TOTUNIT) + OH.REC(OH$MILEAGE) - TCOST + OH.REC(OH$DISC)  ;* includes parts warranty adj
              PCT.PROFIT = INT(((PROFIT * 1000) / OH.REC(OH$TOTUNIT)) + .5)
           IF LABOR + 0 = 0 THEN
              LABOR = OH.REC(OH$LABOR)

Hint: MILEAGE usually isn't added into a PROFIT total.

Even if a junior programmer was only skimming the program for errors, he would probably catch this bug. In fact, had the programmer used attribute constants he never would have coded this bug in the first place. When you use hardcoded numbers, you can never be 100% sure all the other programs are using the same numbers in the same way.

1.13. Include files must have $IFNDEF name.DEFINED logic. This prevents double definition of equates.

	   * EQU statements *

1.14. Use scratch variables for holding temporary values, and use obvious names like TEMP, SCRAP, or XX. Scratch variables should be used within small blocks of code without any GOTO's or GOSUB's so they don't get reset unexpectedly. If a scratch variable is used in a block that GOSUB's to another section of code, it should be given a meaningful variable name.

2. Statements, Labels, and Flow Control

Consistent coding structures provide predictable, defined behavior. They also modularize the code, breaking down complex problems into simpler components, and making the flow of control obvious.

2.1. Lines should not exceed 70 characters. Inline comments are the exception. Shorter lines are usually easier to read and debug.

2.2. Use PRINT for printing, CRT or DISPLAY for displaying. This makes it clear what you intend the output to be.

2.3. File OPEN's should ABORT on the ELSE clause. All files should be opened to variables of the same name (as best as possible.) Abbreviations should not be used. ELSE clauses should always have ELSE ABORT 201, "filename".


Why abort? Because 99% of the time, there is no good reason for a file not to exist -- it is almost always the result of a programmer's error. We don't want to hide these errors.

2.4 Multi-statement lines using the semi-colon (;) should be avoided. IF-THEN, READ-ELSE, and other similar statements should be broken into multiple lines to aid readability.

	* BAD *

	* GOOD *
	   ENTER @W7

2.5. IF-THEN, READ-ELSE, and other similar statements should terminate with END.

        * BAD *

        IF FLAG

        * OK *
        IF FLAG

        * BETTER *
        IF FLAG THEN
        END ELSE

2.6. Conditional IF-THEN statements should be coded using "positive value testing". Never use IF-ELSE.

	* BAD *
	IF NOT(UNUSEDSW) THEN    ;* double negative adds confusion
	IF UNUSEDSW ELSE         ;* unexpected and easy to miss

	* GOOD *
	IF USEDSW THEN           ;* recoded to a "positive value"

2.7. Booleans should be used as a complete expression, not as a variable for comparison.

	* BAD *
	   * do some processing *

	* GOOD *
	   * do some processing *

2.8. Use the NOT function to toggle boolean variables; you don't need to use IF-THEN-ELSE statements.

	* BAD *
	   PSW = @FALSE
	   PSW = @TRUE

	* GOOD *

2.9. Break up complex conditions into manageable pieces. Better yet, try to use meaningful variable names to reveal the business rules that dictate the processing.

        * BAD *
           GOSUB 7500

        * GOOD *

2.10. Avoid using obscure and unfamiliar coding styles. Don't do this:

	* BAD *

	* GOOD *
	   MAX = (VALUE+1)

2.11. Avoid using GOTO. GOTO's should be used to skip large sections of code, not for local flow control. Small loops should use LOOP-REPEAT or FOR-NEXT statements, with EXIT or CONTINUE for flow control. Note that CONTINUE will always go to the top of the loop, so tests for exiting out of a loop should be reachable.

	* BAD *
	220:  XX = XX + 1	
	      VALUE = TABLE<1,XX>
	      IF VALUE = "" THEN GOTO 230
	      * do some processing *
	      GOTO 220
	230:  BLAT = "something"

	* GOOD *
	         XX = XX + 1
	         VALUE = TABLE<1,XX>
	      UNTIL VALUE = "" DO
	         * do some processing *
	      BLAT = "something"

GOTO's are bad because they hide the natural structure and flow of a routine. In the previous "BAD" example, it's hard to see the beginning and ending points of the loop, while the "GOOD" example is obvious that the code is looping.

2.12. LOOP-REMOVE-REPEAT works faster than EXTRACTing or FIELDing data, and should be used for small loops.

	CODES = CODES   ;* resets the internal pointer
	   * do some processing *

2.13. LOOP-WHILE READNEXT-REPEAT is simpler than using the ELSE clause to set variables for flow control.

	   * do some processing *

2.14. Labels for GOTO's and GOSUB's should consist of words, not numbers. Further, labels should reflect an action performed on an object (e.g. DISPLAY.MSG: not 9000:.)

Some programmers propose to have both numbers and words in labels, e.g. P600.GET.PRICES. While this might seem like the best of both worlds -- having numbers to help find the routine in the source code listing and words to determine function -- I feel that the numbers make the program more rigid, which discourages simplification and refactoring.

Labels should be on a separate line to aid readability –- remember that GOTO/GOSUB should be used to skip to another section of code, not for local flow control (i.e. jumping out of a loop.)

Labeled routines can be ordered either alphabetically, by order of appearance, or by "locality". "Locality" refers to keeping subroutines close to where they are used; this makes it easier for other programmers to correlate functionality between routines.

2.15. CASE and ON-GOTO/GOSUB statements should be used to direct program flow where applicable, instead of multiple IF-THEN statements.

2.16. Avoid using RETURN TO statements. Either use GOSUB-RETURN or GOTO logic. Using RETURN TO breaks the modularity of the routines and creates unpredictable behavior. Instead of using RETURN TO, restructure the routines to use RETURN.

2.17. Application programs should never use PRINT, DISPLAY, or INPUT statements directly. All I/O between the user and program should be controlled by I/O subroutines. It's too easy to forget to code for all the possible entry/presentation scenarios for every piece of data. Subroutines provide modularity and give the means to change behavior across the entire product.

2.18. Don't use DATA statements to pass parameters to programs, as this screws up any possibility for automated testing.

2.19. Nested statements should be minimized; no more than two levels. Past that, programs become harder to comprehend and maintain; restructure deeply nested logic.


3.1. Generous use of whitespace makes programs easier to read, and makes blocks of code stand out. Use extra blank lines instead of *'s, which clutter up programs.

3.2. Avoid "noise" comments. Comments shouldn't simply restate what a person can figure out by reading the code.

3.3. Do include "milestone" comments, which help programmers find the beginning of major sections in your code.

3.4. Do include "purpose" comments to explain "why" more so than "what" a section of code is doing. Each block of code should have a comment if the purpose of the block isn't obvious. Ask yourself, "will the purpose of this code be obvious to me 6 months from now?".

3.5. Unused code should be removed, not commented out. Source code control keeps track of the code that changes between program revisions. Manually keeping track of lines added, modified, and deleted is tedious and error prone. The only times when it's acceptable to comment out code is when the code will be reactivated in the future, or when it's documenting an old method or process that was refactored (i.e. we used to do it this way, but now we're doing it better, so don't try to do it the old way again.)

3.6. Comments in the program header describing modifications are acceptable, but not necessary. All changes should be documented in the appropriate release notes document for the base product or customer release. Header comments tend to be ignored until something goes wrong, while release notes tend to trigger better documentation, training, etc.

3.7. Do include "*TODO*" comments to let people know that you're not an idiot and you realize more code needs to be added. It serves as a reminder and the next programmer may implement it for you later.

	*TODO* add validation logic for fax number.

3.8. As programmers, we write and maintain comments almost as well as we read them. Comments must be kept in sync with program changes. It is important to review all the comments when making program changes and keep them up to date.

4. Record Locking

4.1. All programs that update files must lock records. All statements that lock records must have a LOCKED clause.

4.2. Interactive programs should attempt to lock the record, and display a "locked" message if the record is already locked and allow the user to access another id – users should never be in limbo waiting on a record lock.

4.3. Batch programs should attempt to lock records and timeout after a specified period of time, skipping over the record. Batch processes should never be blocked on a locked record. "Don't block on a lock."

5. Subroutines

Subroutines and functions should be designed with a specific "contract" in mind. In practice this means that the calling program promises to pass specific values to a subroutine, and in return the subroutine promises to return specific values after performing its defined function. This contract should be enforced by using equated constants specific to the subroutine. The subroutine should also validate all variable arguments to be within the proper range of values. Subroutines should always return an error code in the event that an error occurs.

5.1. Never break the subroutine contract. You never add or remove arguments from a subroutine call or change their type. Doing so would break the modularity of the system. If you need to add/change/remove arguments, create a new version of the subroutine, e.g. MAIL.SUB would become MAIL.SUB2(SERVERNAME).

5.2. "Bugs" and "error conditions" are different. Programming defects (bugs) should ABORT, loudly crashing the program to make bugs easier to spot and fix during testing. This is different from error conditions (errors), where the data is valid and within range, but cannot be successfully processed. Errors don't crash the software, but are returned with meaningful data.

For example, opening a file for reading/writing should ABORT if the OPEN statement fails. There's no good reason for a file not to OPEN during normal usage, and failure to open is usually caused by a programmer's error. This type of mistake is easier to find and fix when the program loudly ABORTs. Similarly, missing or invalid arguments to a subroutine should cause an ABORT in the subroutine, so bugs in the calling program can be fixed. Conversely, if the subroutine quietly ignores or fixes the arguments, buggy data can propagate into other routines and even into data files. Bugs are unexpected, so we set "traps" in the most likely places to catch them.

Error conditions are caused by expected conflicts in processing data. For example, we expect users to make mistakes in data entry, so we diplomatically tell the user their mistake and repeat the prompt quietly. Or the network is down causing the user's request to fail, so we display a short message of the condition to the user and allow them to retry later.

5.3. All external subroutines should return ERRDATA and ERRCODE variables. ERRCODE should be numeric, with 0=SUCCESS, so a program can test IF ERRCODE THEN... ERRDATA should at least return an error message string; otherwise, it can contain an entire dynamic array with multiple data values. ERRDATA and ERRCODE are never used to pass "good" values back to the caller (also, never pass error data back to the caller using "good" variables.)

5.4. All external subroutines should have specific input-only and output-only arguments. Subroutines with a bunch of arguments (i.e. requiring the SUBROUTINE header line to wrap) should encapsulate the arguments into a static or dynamic array structure. A structure should also be used in subroutines that process a variable number of arguments.

5.5. Subroutines should have assertions to make sure their input and output arguments are valid and match the contract. Assertions are checkpoints programmed into the software to ensure variable data "looks" the way we expect it to. When the data is different from our assertion, the program ABORTs, loudly crashing and telling the programmer what's wrong. This methodology is the opposite of defensive programming, where programs hide bugs by fixing mistakes from other programs. The goal is to eliminate bugs at their source, and defensive programming works against this goal.

        EQU ASSERT LIT " IF "    ;* no-op by changing " IF " to " *IF "

Note that assertions always test for positive conditions, and fail when the condition is not met.

5.6. Input-only arguments should never be modifed by the subroutine. They should be validated to ensure the proper data type and range of values. If the input-args are not valid, the program should abort. This is a programming error -- remember the contract, the caller promises to pass valid data, so this is a bug in the caller.

5.7. Variables passed to subroutines can be passed by value, instead of by reference, to protect their value. Enclose subroutine variables in parentheses to make sure they don't change.

	CALL IN.AIFM((IM.ID))    ;* extra parentheses keeps the
                                 ;* value of IM.ID from changing

5.8. Subroutines and functions should have the fewest valid exits possible. Obviously, if an error occurs, processing should stop and the subroutine returns. Normal logic flow should proceed from the top of subroutine to the final RETURN statement.

Some programmers argue that a routine should have "one entry, one exit". While that makes sense in some situations, it doesn't make sense in all. This should really be considered on a case-by-case basis. Routines have a natural structure and flow, and if the flow gets disrupted by an error condition (e.g. network outage) then the routine should exit rather than continue down the flow.

5.9. Try to segregate MVC (model-view-controller) functionality. Business rules logic should be separated from user interface processing as much as possible.

6. Programs

6.1. Have a "zero defect policy". This means that all bugs and non-standard code are corrected and tested before new code is added to a program. The product must be ready to ship at all times. All programmers are responsible for maintaining product readiness.

6.2. Programmers, not testers, are responsible for finding all bugs in the product. Only programmers have the expertise and knowledge to hunt down data and programming errors. Testers are responsible for ensuring that the product functions according to specification, and suggesting changes in functionality to build a better product.

6.3. Test programs and data will be devised before writing new features. These items will be added to the testing framework along with the new features to complete the automated test suite.

6.4. All base-product programs reside in the PROG file. All customized programs reside in the CPROG file.

6.5. Both PROG and CPROG will only contain 100% working programs -- no program fragments, no backup copies, no equates, no common areas, and no documentation. Every program in should compile cleanly and can be cataloged as an executable program.

6.6. All program changes are documented in a "release notes" document. Each product release will have its own unique document.

7. Printing, Faxing, and Email

7.1. Print programs should use table or record-driven parameters for customizing output. Individual customers will have their own unique parameters, but the logic for printing and database access will be standardized.

7.2. Print programs do not update files. The system should allow a report/invoice/statement to be run numerous times without any "undo" logic.


2007-11-29 by Rex Gozar

Please leave feedback here :)

2007-11-30 by Ian McGowan

A piece of meta-feedback - this seems like a coding standards and a programming philosophy document met in a bar late one night and produced this strange offspring. E.g. having detailed variable naming conventions mixed with a zero defect policy statement seems unusual. It might flow better to have a strict "rules" document that must be obeyed, and then a "guidelines" document (or addendum to the rules, since the rules are 90% of the document) that talks about some of the fuzzier areas. The fuzzy areas are those that apply to any language, not just PICK basic. A small quibble :)
1.1 seems to contradict 2.3 - why not use the abbreviation for the file name too? And doesn't 2.3 belong in section 1 anyhow?
The example in 1.2 uses PO. as the abbreviation, though the standard above says to use POH. :-)
I don't like the idea of using $ in a variable name, but don't have a good reason why not, or an alternative even, so I shouldn't bring that up. Is it because the code starts to look like perl?
It's really interesting to see some things written down which I've been doing without (conscious) thought. Seeing this list gives me several ideas for changes to how I do things, and definitely some topics to discuss with the other basic programmers I work with.

2007-11-30 by Rex Gozar

Having its start in-house, it was simpler to combine all my thoughts on coding into a single document. So yes, conventions and philosophy are mixed together here, probably more as a teaching aid for our staff.
As for variable names for files, I would find using the abbreviation as part of the variable name acceptable too, e.g. CUST-MAST "CM" would have the variable name "CM.FILE", or "CM.DATA" vs. "CM.DICT" -- sure, I'd go for that. Again, this was based on our in-house legacy application, and it was already implemented in most of the programs.
In our standard here, the "$" always denotes a constant, not a variable. In the spirit of "making wrong code look wrong", I should never see any assignments to variables with a "$". I would also expect to see the "$" constant used everywhere a record attribute number would be used. In C, the convention is to use an underscore before and after the constant, e.g. _MAXROWS_, but Universe does not like leading underscores.
To everyone else: should the "rules", "guidelines", and "philosophy" be split into different documents or sections?

2007-12-20 by Brian Leach

Rex this is a great start. But it needs to be split out to get wider contributions.
It is also old style in some respects, and there are a few errata:
  • REMOVE isn't quicker on UniVerse, it has a hint mechanism that makes up for this.
  • Don't use ABORT 201. That's fine in a green screen environment, but not for a modern system. The abort message won't get passed back to a calling client.
  • Don't use static arrays with subroutines. They can't be passed through UniObjects or any other API.
  • The bit about 2 digit prefixes is way too site specific. When there are hundreds of files, that just won't work - but the idea of a prefix ignoring length is a good one.
  • No limit on line lengths really, please. Some of us escaped from green screen editors decades ago.
  • No numeric labels. They belong in the dustbin of history. Use meaningful labels that explain what the section does, so you can parse quickly through code.
  • Mixed case coding, mixed case coding. There are good reasons why the whole rest of the world uses it. It's faster to scan and read (once you get used to it) and easier to identify errors. Even UniData has emerged from the darkness with that one.
  • You don't mention external functions. Use these - they give compile time argument checking that is worth the overhead in declaration.
  • Your comment on record locking only works in session based programming. Not for web services or other connectionless processing.
But, a great start.

2007-12-20 by Rex Gozar

Yes, it should be broken into different sections to encourage contributors to add relevant comments and arguments.
As for REMOVE, it does appear to work faster on values and sub-values; field extraction using <amc> syntax takes advantage of the hint mechanism (has anyone tested this to really see if this is true?)
ABORT'ing on problems that should never happen is a good thing. I often see fancy error messaging on file opens, but it's really unnecessary. File opens usually fail because the programmer forgot to create the file or pointer when they installed a new mod. ABORT'ing is an effective technique in finding such "one time" errors quickly, in both green screen and web service applications.
Having a 2-4 letter mnemonic or prefix is probably too site specific, but the idea is that there should be some simple semantics devised to help a programmer easily relate file constants and variables thoughout the entire application. Even here, we have some prefixes longer than 4 letters because we wanted to make them more distinctive.
Shorter line lengths are meant to help junior programmers' comprehension. Unless you're using really long variable names, a long line of code may indicate overly complex logic, or at best poor cohesion (e.g. combining business rules calculations with view and controller logic.) Of course, some BASIC statements just end up being long... these are just guidelines.
Even in the stateless world of the web, it still isn't a good idea to "block on a lock". Without the interactivity of green screens, we don't want web services to hang indefinitely on a record lock. A better strategy is to retry a reasonable number of times, then log the error and report the condition back to the web service for appropriate handling.
Whatever standards you devise [for your site], it's important to make the effort to get ALL programs to conform to YOUR standard.

2008-01-09 Ian McGowan

The suggestion to get ALL programs to conform raises an interesting question - in the Java world we use cruisecontrol or hudson in conjunction with junit, checkstyle, findbugs and other analysis tools to do automated builds and tests. What would a similar framework look like in the Pick Basic world?