您的位置:首页 > 博客中心 > 数据库 >

PLSQL Coding Standard

时间:2022-03-14 01:15

Naming and Coding Standards for SQL and PL/SQL

"The nice thing about standards is that you have so many to choose from."

- Andrew S Tanenbaum

Introduction

This document is mentioned in . One of the first comments begins:

I have had a look through the document and it is mainly concerned with making code neat and maintainable and NOT with writing efficient code. Although the content is commendable and well intended, I have never actually encountered a site where such an approach succeeds in practice.

and concludes:

If all my predecessors had just used the following two standards my life would be much easier.

1. Give EVERYTHING meaningful names.

2. If anything is not obvious, change it so it is, and if you can‘t, explain it with comments.

And of course the poster is quite right. It is unproductive to tell everyone, in minute detail, how to use each language feature and how to lay it out, and nobody will follow that type of instruction anyway. On the other hand, the chaos that ensues when nobody cares also leads to unmaintainable, inefficient and hard-to-read code. The idea of this document is to set out some guidelines that at least encourage developers to think about how code should ideally be standardised.

Why have standards?

Clearer Understanding

Whether you are amending an existing program or writing something new, clearly structured code that tells you what kind of object you are dealing with and how it fits into the structure can do some of the thinking for you.

Bugs that were hidden in a cryptic jumble can just leap out at you when the code is laid out logically. Often when facing a debugging problem in poorly maintained code, the first step is to straighten out the layout, allowing you to see how it can be rationalised. (For example, look at the query shown in , then scroll down to see the formatted version. That final nested subquery was a surprise to me as well.) Often just understanding what the code means is halfway to solving a programming problem. Clearly, not understanding what you are working on is likely to prove a hindrance to efficient programming. Like those "De-clutter Your Life" makeover shows on TV, the result is a more efficient, pleasant and relaxing place to be, where you can easily reach for the things you need without tripping over the toaster. So THAT’s the structure, you think as you see it in its new untangled format. This is called and it should be carried out ruthlessly.

Better Code

Simply using clear names for objects, and laying out code so that the structure is easy to follow, should reduce and result in better-structured modules. It will be easier for others to see how the code works, and therefore modify it as necessary without increasing , which occurs when the originally intended design of a module is eroded by subsequent changes. The entropy accelerates as the code becomes harder to understand. You don‘t have time to figure out how all the sprawling loops and confusing variables interact with each other, so you paste in another ELSIF and hope for the best. Eventually it becomes impossible for anyone to figure out with any confidence how the code really works and you have what is known as a or .

Fewer decisions to make

Having guidelines to follow means that there are some decisions that you do not have to make. Naming objects, for example, becomes simpler. You also do not have to justify, argue and defend every decision you make.

Easier to use

Everyone should find it easier to navigate around the database, construct queries that work as expected and use shared utilities, if they are built and named in a consistent way.

Easier Maintenance

Maintenance programming becomes easier since the code is easier to predict, read and modify. Changes are easier to implement and are more likely to work first time, making them faster and cheaper.

Why standards are tricky

Reading about all the great advantages, it seems straightforward: everyone should have development standards and apply them consistently. It is worth considering some potential obstacles and philosophical limitations, however.

Defining a set of rules that is both complete and consistent is, for one thing, impossible. Persuading developers to agree to them in principle and follow them in practice are further challenges. Then you just have to convince QA staff to enforce them and management to take them seriously.

Picture a code walkthrough. The developer presents a new module which has been unit tested and documented. The business analyst confirms that it complies with the requirements. The tester confirms that it matches the documentation. The support staff are satisfied that it won‘t do any damage if they install it. Deadlines are tight. Are you really going to point out that it doesn‘t line up, uppercase and lowercase are mixed up randomly, there are numbers used as booleans and a cursor named ‘c1‘? And perhaps that Cursor FOR loop wasn‘t the most efficient solution, now you look at it - was the developer not aware of multi-table Inserts? Perhaps in the scheme of things it‘s not that important. Maybe it‘s too late now anyway. You don‘t want to seem like a pedantic nitpicker who gets in the way, and who is to say what‘s right anyway? Maybe you should just throw those coding standards in the bin.

Or let‘s say you are a developer, and management has decreed that for ISO9000 compliance there should be a documented set of standards in place. Someone with limited programming experience has been given the task of putting something together and has downloaded something off the Internet. It is full of arbitrary rules that make no particular sense and some of it, you‘re sure, is plain wrong. The manager has not read it and never will. It is 30 pages long and poorly formatted. You quietly ignore it.

Or imagine a system that has been thrown together without much care by developers who have now left. It is a mess of spaghetti code and elaborate workarounds for problems of its own making. You need to make a small change, but while editing the code you are horrified by what you find, so you straighten things out a little. At the code review however, you find yourself in trouble for making unauthorised changes. Testers (who have run a diff report against the previous version and found three hundred changed lines) complain that you have landed them with extra work, support staff are concerned that you might have broken something that may have been done that way for a reason, and management think you have wasted time on cosmetic fiddling, while also pointing out that maintenance and new functionality come out of two separate budgets.

So what is it that we really want?

Standards in practice

We should first accept some philosophical limitations:

  • Completeness is impossible.

    To cover every possible coding situation would take a huge reference manual, and let‘s face it, no-one would want to read it, so a document like this can never attempt to be complete.

  • Consistency is impossible.

    A set of standards can never be fully consistent. For example, consider the rule "All keywords and reserved words should be typed in uppercase". So you write TO_NUMBER and TO_CHAR - so far so good. But what about your own functions IS_NUMBER and BOOLEAN_TO_CHAR? Either you write is_number and TO_NUMBER, or you stretch the definition of ‘keyword‘ to include those keywords you defined yourself. But now look at the declaration:

    CREATE OR REPLACE FUNCTION IS_NUMBER...

    ...next to...

    CREATE OR REPLACE VIEW sales_summary...

    Well, maybe that‘s correct according to our rule, but it feels inconsistent. You might argue that the ‘IS_NUMBER‘ in the function declaration above is syntactically just a label at this point, so actually that should be:

    CREATE OR REPLACE FUNCTION is_number...

    And if procedure and function names are keywords, what if they are in packages? Should that be DBMS_OUTPUT.PUT_LINE or dbms_output.PUT_LINE? (At least this matches mysequence.NEXTVAL and myobject.METHOD.) If the former, we will have to write MYPACKAGE.MYFUNCTION but mypackage.myvariable. Meanwhile, the jury is out on whether USER, LEVEL and the like are functions (uppercase) or pseudo-columns/system variables (lowercase).

    In the end, all you can do is try to be reasonably consistent.

Be flexible

There needs to be some flexibility and consensus built into the standards. There is nothing worse than arriving at a site and being told that you must place each semicolon on a new line, or some such nonsense, apparently just for the sake of it. For these reasons it is sometimes argued, with some justification, that all corporate development standards are stupid.

In this document you will notice there is often both a formal standard and an informal one. By formal, we mean following to the letter every single rule for qualifying object names and laying out code. However in practice doing all of this in every case can be excessive, and actually a distraction from the program logic, not to mention a burden for the programmer. The informal version can therefore be regarded as the minimum standard. It is up to you to decide in each case which to choose. For example, the formal rule for calling procedures is that each argument should appear on its own line below the procedure name: but if there are only one or two short arguments it is simpler to put them all on one line. However what you don‘t do is put them all over the place without bothering to think about it.

File naming conventions

Maintain your code in text files, rather than editing directly on the database using TOAD™ etc. This gives you version control, as well as protecting you in the event of your stored module being modified or dropped by another user.

For production code, define package specification and bodies in separate files. This gives them independent version control, and also makes it easier to install a new version of a package body with minimal impact on other schema objects. (The same applies to types.)

The choice of extension will be influenced by your editor and IDE defaults, and other installed applications that may have laid claim to common file extensions: for example on Apple Mac, ".pls" is by default an iTunes playlist. The following extensions are suggested:

File TypeExtension
Package (spec, and optionally body) pkg
Package Body pkb
Procedure prc
Function fnc
Object type (spec, and optionally body) typ
Object type body tyb
Stored Java Procedure sjp
Stored Java Function sjf
Trigger trg
Any other PL/SQL pls
Any other SQL sql

Identifier naming conventions

Identifiers can be thought of as consisting of five possible parts: <Scope><Type><Primary Identifier><Modifier><Suffix>. Of the five elements the primary identifier is the most important. All others are optional and only make the name more self-documenting, or clearer in cases where it could otherwise be confusing.

For example, although the full formal specification might be gv_customer_id (indicating a global variable), in practice the simpler g_customer_id will generally do fine and should be used unless it is unclear or ambiguous. This goes for all variable and type prefixing - sometimes including too much information in the name can actually obscure its purpose, and so there is often a case for using e.g. address instead of g_address_rectype. Use the minimum that is still clear.

Identifier elements

Scope

Scope refers to where the identifier is defined and where it can be used. If you are debugging some code it is very helpful to know whether a particular value is global, and could therefore have been changed by anything in the entire system, or whether it exists only within the current procedure.

g is global, i.e. defined at package level (either public or private), rather than within a subprogram. This is almost always worth including in the name of a package variable.

l is local, i.e. defined within the current procedure or function. Generally the absence of a g_ prefix alone will make this clear, so this is one to keep in reserve until you really need it.

p indicates a parameter, telling us not only that the scope is limited to the current procedure or function, but that the value was passed in as an argument. This information is essential.

Type of identifier

The two kinds of identifier for scalar values are constants (k) and variables (v). You could break these two down further into the individual data types such as VARCHAR2 and DATE, but then it gets complicated. There are only so many pieces of information about an identifier that it are really worth repeating every time you refer to it, and in any case we want to travel light.

TypeDescriptionExampleComment
k Constant k_mailhost Generic constant
g Variable g_loop_count Global (package-level) variable
v Variable v_loop_count Local variable

Note that we have collapsed gv_ to g_, and gk_ to k_ to avoid burdening the names with too much information. g_ says global variable unless there is other information to the contrary. And ninety nine times out of a hundred it does not make a difference to the next programmer whether a constant was defined at the package level or locally. Its value isn‘t about to change after all.

In addition to these scalar types there are aggregate data types, as follows:

TypeDescriptionExample
c Cursor c_employees
cp Cursor parameter cp_employee_id
r Record r_employee

In addition to the built-in types are programmer-defined Types and Subtypes. Note that:

  • The purpose of using a subtype is to give a clear, centrally defined, self-documenting name to a technical definition, which you can alter later (should you need to) with the minimum effort and code editing. There is no point in declaring something like subtype emp240_rtyp c_emp240%rowtype (a record type based on a cursor) - if anything, this just makes the code harder to follow, since emp240_rtyp is now yet another thing the next developer has to figure out and remember. A better use for subtypes would be something like subtype money is stock_items.unit_price%type.
  • In defining our own datatypes we are essentially extending the language. Since the standard datatypes (INTEGER, BOOLEAN) do not require indicator prefixes, why should ours? There is therefore often a case for using a logical name such as address rather than the formal g_address_rectype, unless doing so is unclear or ambiguous.

Primary Identifier

The primary identifier is the most important part of a name. It should succinctly indicate the meaning and purpose of the named thing. This can be a single word or a phrase, with optional modifiers (above) added to give further descriptive information where necessary.

There is often a trade-off between length for documentation and brevity for typing purposes. Think carefully about each name and the object it represents. Usually if the first name you think of seems too long, you can make it shorter by thinking of a simpler way to describe it, rather than the traditional approach of using the first rambling jumble that pops into your head and then removing half the vowels, to give something like ODI_LOE_BY_AC_PD_L1 (the name of an accounting table, apparently - from an actual posting on Metalink). And is flgtdt really an improvement on flight_date? What about function fs_db_host_nm? - is nm short for name, number, or something else? (And why db anyway - what else would it be?) A good general rule is therefore:

Avoid abbreviations, unless they are already familiar terms.

We can follow that up with:

If the name is too long, think about it some more. You can do better.

Suffix

The suffix is used to qualify the identifier further by documenting its usage. For example, the suffix denotes the type of parameter, whether IN, OUT, or IN OUT, or to show that a variable or parameter is also a table.

TypeDescriptionExample
  Input-only parameter p_num_items
out Output-only parameter p_sum_out
inout Both input and output p_sum_inout

Now that the basic approach to naming is defined, we‘ll look at how this is used in practice.

Identifier naming examples

Cursor Declaration

Cursors are usually named after the table or view that is being processed. Prefix cursor names with c_. You may still specify the scope of the variable as usual, if you need to.

Note that a cursor defines a set of records, so it should generally be named in the plural, e.g. c_customers (or possibly c_cst if you have a strong system of 3-letter shortnames). It does not ‘get‘ anything.

c on its own may occasionally be clear (for example if it is the only cursor in a very small block of code), but never c1. Naming a cursor c1 is both an admission that there may be other cursors, and a feeble attempt at providing an extendable naming system. If there is ever likely to be a c2, it means you should have taken the trouble to name it properly in the first place. And you would not normally want more than one cursor in a procedure anyway, when you could either combine them into one more efficient query or create two better-modularised procedures.

ScopeTypePrimary IdentifierModifierExample
Local cursor Account New c_new_accounts
Parameter ref cursor passed as parameter Account Old pc_old_accounts

Record based on table or cursor

These records are defined from the structure of a table or cursor, and so should reflect this in the name. Note that while a cursor defines a set of records, a record is by definition just one and so should be in the singular.

ScopePrimary IdentifierExample (formal)Example (informal)
Local Account lr_account r_account
Parameter Account pr_account p_account
Global Account gr_last_account g_last_account

If you have more than one record declared for a single cursor, preface the record name with a word that describes it, such as:

ScopePrimary IdentifierModifierExample
Local Account new r_new_account
Parameter Account duplicate p_duplicate_account
Global Account old g_old_account

Loop index

Generally a very simple name such as i will be perfectly clear since, unless there are inner loops, there can only be one such index variable:

FOR i IN 1..12 LOOP

However, it may sometimes be preferable to give a more meaningful name, or to give it a formal prefix:

FOR i_month IN 1..12 LOOP

The same applies to cursor loop records. Often r will be perfectly clear, or else follow the convention described above for records. (If you are tempted to use r1 though, it means you should think of a proper descriptive name.)

FOR r IN c_discrepancies LOOP
FOR r_emp IN c_emp LOOP

Collection (PL/SQL table) type

In order to create a collection, you must first first declare a TYPE. I use _tt for collection ("table") type.

Generally you should be able to reuse a generic collection type for multiple variables, e.g. v_mailing_addresses and v_invoice_addresses might both be declared address_tt.

type account_tt is table of accounts%rowtype;
v_accounts account_tt;

Also note that there are several different collection types in PL/SQL: associative arrays (Index-By tables), varrays and nested tables. Additionally, types declared at the database level with CREATE TYPE have properties not shared by their PL/SQL package equivalents, such as their ability to interact with SQL. How much of this information you need to encode in each identifier depends on the situation, but don‘t feel you have to use a different naming convention for each one. For example you may find it useful to distinguish between address_aatt, address_vatt and address_ntt, but more often the simpler address_tt will do fine.

Collection

A collection is declared based on a table TYPE statement, as indicated above. Treat it as a normal variable, but put the name in the plural to indicate that it holds (potentially) more than one value:

v_new_accounts account_tt;
v_old_accounts account_tt;

PL/SQL Record type

When you create records with a structure you specify (rather than from a table or cursor), you must declare the type. Use a rectype suffix declaration as follows:

ScopeTypePrimary IdentifierModifierSuffixExample
Global   Account   rectype g_account_rectype
Local   Account   rectype account_rectype
type g_address_rectype is record (...);  -- Formal
type address_rectype is record (...);    -- Informal
type address is record (...);            -- Chilled

Note, however:

  • In practice you would generally hope that an old account will have the same record structure as a new account, and a mailing address type will be the same as an invoice address, in that they are accounts and addresses.
  • It may also not be particularly relevant whether it is local or global. It may actually aid the program for this fact to be transparent.
  • The fact that it is a type may be perfectly clear from the way it is used.
  • Therefore, although the formal definition above should be used when it aids clarity, there is often a strong case for saying that the new datatype you have defined is an ADDRESS or an ACCOUNT, and should simply take its place alongside INTEGER and DATE.

PL/SQL record instance

Once you have defined a record type, you can declare actual records with that structure. Now you can drop the type part of the rectype suffix; the naming convention for these programmer-defined records is the same as that for records based on tables and cursors.

PL/SQL object instance

Arguably, you could prefix PL/SQL object names with o_. However, in practice it usually makes most sense to use the v_ prefix we use for variables, e.g. v_load_timer()

Schema-level object type

A type specification is an SQL object; the body, if present, may be implemented in PL/SQL, Java etc. Although they are superficially similar to PL/SQL record types, they incorporate advanced features such as constructors and member functions. For object types I use _OT. (An exception can perhaps be made for truly global types such as .)

A consideration with SQL object types is that the namespace is the entire schema, rather than just one PL/SQL package, procedure etc. This makes it especially important to think carefully about what the type represents and how it will be reused. Within a PL/SQL package you might get away with defining and naming a record type in whatever way suits the code you are writing, but at the schema level you should try to maintain a clean set of clearly defined types that are available for use throughout your application.

Schema-level collection type

Similarly to object types, you should translate any program-specific type requirement into something generic and reusable. As a starting point, the following should prove useful in any application:

create or replace type number_tt as table of number(38,10)
/
create or replace type integer_tt as table of integer
/
create or replace type date_tt as table of date
/
create or replace type varchar2_tt as table of varchar2(4000)
/
create or replace type timestamp_tt as table of timestamp
/

(See also , above, for notes on PL/SQL collection types.)

Code format

How you format your code in your source code is an intensely personal issue. Many people use conventions they have seen in text books or picked up from existing code, sometimes without thinking very hard about it. They can end up using a mishmash of techniques that makes the resulting code hard to read. So it is important that every programmer applies a consistent and (above all) logical coding style that is easy to read and maintain.

There is only one fundamental reason for formatting your code: to show the logical structure of your program. It is not about how it looks. Writing code to look nice is a waste of time, and misses the real point anyway. By being clear, consistent and logical you will produce a layout that tells it like it is, good or bad. Format mechanically and let the design show itself. You want to see that design. If the structure is bad, you want it to stare you in the face so that you can improve it.

Indentation and Alignment

Indentation is one of the most effective ways to display a program’s logical structure. When the blocks of code line up, you can immediately start to see where IF and LOOP constructs start and end, where a block begins, where the exception handlers are, and a hundred other things.

  • Use spaces (not the tab character) for indentation of all PL/SQL and SQL code.
  • Use 3 spaces for each nesting level.
  • Left-align blocks to create a clear vertical line at the block level. (SQL statements will often break this with their own subqueries etc, but you can still maintain a left-alignment layout.)
  • Do not feel obliged to align statements, operators etc. vertically. For example, the following is unnecessary, though as ever you should use it where it aids clarity:
    v_account_id           accounts.account_id%type;
    V_customer_id          customers.customer_id%type;
    v_success              boolean := false;
    K_error_code  constant pls_integer := -20010;
  • For a function or procedure declaration, start the first parameter specification on a new line.
  • Place one parameter specification on each line.

    This should give parameter specifications like the following:

    function invoice_address
        ( p_account_id accounts.account_id%type
        , p_date date )
        return address;
  • Note that this approach aligns ALL procedures and functions with each other, regardless of the length of their names, their parameters, or their return types:
    procedure do_stuff_rather_long_name
        ( p_first_parameter boolean default false
        , p_second_parameter service_agreement.agreement_status%type default 2
        , p_third_parameter currency_amount default 0 );

The following code sample illustrates indentation. (Ignore the code itself - it was something I worked on 10 years ago, and now with hindsight it looks a bit of a hack.)

create or replace function ncnc
   ( p_ctn varchar2 )
   return varchar2
as
   v_result varchar2(11);

   cursor c_mask is
      select new.*
           , new.cm_ctn_prefix || substr(p_ctn, old.cm_ctn_prefix_len +1) new_ctn 
      from   ctn_mask old 
           , ctn_mask new 
      where  old.cm_ctn_prefix = substr(p_ctn, 1, old.cm_ctn_prefix_len) 
      and    (    (    new.cm_ctn_prefix =
                       substr(old.cm_ctn_prefix,1,1) || ‘7‘ || substr(old.cm_ctn_prefix,2)
                   and new.cm_ctn_prefix_len = 5 )
              or  (    old.cm_ctn_prefix like ‘07___‘ 
                   and new.cm_ctn_prefix = old.cm_ctn_prefix ) ); 
   r_mask c_mask%rowtype;
begin 
   if p_ctn is not null then
      open c_mask;
      fetch c_mask into r_mask;
      close c_mask;

      v_result := r_mask.new_ctn; 
   end if;

   return v_result;
end ncnc;

Using case to aid readability

Unlike many other computer languages, PL/SQL identifiers are not case-sensitive. It inherits this from SQL, since the table and column names it must work with are also not case-sensitive (unless explicitly stated using double-quotes). While this is convenient and forgiving to the programmer, it can also allow us to write confusing code, since fireAllEmployees can also be written as fireallemployees or FIREALLEMPLOYEES. Note also that even if you code a procedure name as fireAllEmployees, it will still show up in data dictionary listings as it really is, i.e. FIREALLEMPLOYEES. It therefore greatly assists readability if you use case consistently. This may mean checking your editor‘s settings so that (for example) autocompletion doesn‘t leave you with uppercased table and column names while ones you typed yourself are lowercase. Frequently-used case conventions for PL/SQL are:

  • Uppercase keywords, lowercase everything else.
  • CamelCase PL/SQL identifiers, lowercase everything else.
  • Lowercase everything.

Despite personally using the first approach (uppercase keywords) for many years, I have now defected to "lowercase everything", because except in rare cases (the MODEL clause, with its own set of unfamilar keywords such as cv and iteration_number, might be one such example) endlessly uppercasing keywords didn‘t add to readability, and in keyword-heavy PL/SQL code it actually detracts from it, as long stretches of uppercase text are generally hard to read. (The convention arose before PL/SQL, when all you had were SQL keywords, table and column names.)

Whatever convention you adopt, the main point here is to make a reasonable attempt at consistency, and care about neatness and readability.

Formatting simple assignments

  • Use at most one statement per line.
    v_firstname := ‘Lynne‘; v_lastname := ‘Truss‘; -- This is technically valid.
    v_firstname := ‘Lynne‘;                        -- But code this
    v_lastname  := ‘Truss‘;                        -- way instead.
  • Always include a space between an identifier and a operator.
    v_firstname:=‘Lynne‘;                          -- This is technically valid.
    v_firstname := ‘Lynne‘;                        -- But code this way instead.

Formatting multi-line statements

Here are some recommendations:

  • Use indentation to offset all continuation lines under the first line. This is the most important guideline. By indenting the continuation lines using the block indentation and spacing described earlier, it is clear that they “belong” to the first line.
  • Place each parameter on its own line, aligned vertically. (The following is an example of formal layout, for when greater clarity is needed. If the procedure name and parameters all fit readably on one line, then fine.)
    upload_sales
    ( p_company_id
    , p_last_year_date
    , p_rollup_type
    , p_total );

Commenting style

You can avoid a lot of comments if you write self-documenting code. When you do enter comments, keep them short and to the point.

Comment as you code

Commenting as you go leads to better continuity as you code, as well as programs with fewer bugs. By commenting, you are trying to convey to someone else what the program is doing. This always clarifies things in your mind.

Explain why, not how

Avoid documenting the obvious. If your comments only repeat what is going on in the code, then they are just wasting space. Most programmers will understand the basic elements of the language. What is useful is to explain why you are doing something. For example do not insult the intelligence of the next programmer by commenting the exception handling block as (real example) /* Cope with any errors here: */

Make comments easy to enter and maintain

Avoid the tendency to make things look pretty. It takes too long to create as well as to maintain. When you change a comment you should not have to reformat all the lines in the comment. For example, instead of the following indulgence in text art:

          /*********************************************************/
         / +-----------------------------------------------------+ /
        / /                                                     / /
       / /  Main section:                                      / /
      / /  Added for version 1.23.4 WR 01/02/03               / /
     / /  Delete all the customers and fire the employees.   / /
    / /                                                     / /
   / +-----------------------------------------------------+ /
  /*********************************************************/

just keep it to:

-- Delete all the customers and fire the employees:

Use ‘--’ style comments where possible rather than /* ... */. This makes it easier to comment out whole sections later on when debugging, and is cleaner as it does not require a closing tag.

"When I‘m working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong."

- Richard Buckminster Fuller

Maintain Indentation

Comments should reinforce indentation and therefore the logical structure of the program. Always indent the comments at the same level as the code which they describe.

PL/SQL programming and layout guidelines

Now that naming standards are defined, here are some general guidelines for good programming practices. Many of them are universal and would apply to any programming language, even though we are only considering PL/SQL and SQL here.

OUT parameters (and why I don‘t like them)

A function can only return one variable, so if you need to return two or more values (perhaps forgetting about records, object types and collections), multiple OUT parameters can seem like the obvious solution. Sometimes there is a requirement (or at least a tradition) for a subprogram to provide a return code and message as OUT parameters so that the calling program can explicitly access the returned status information without having to parse SQLERRM. Now this may be a real issue with external tools that cannot handle PL/SQL functions or exceptions, but in general it is not such a great solution, because:

  • The ‘Return code‘ approach abandons the standard exception handling model. When you call the procedure, do you trust that all possible exceptions will be converted into codes and handled, or do you (as a good defensive programmer) write an exception handler anyway? If so, didn‘t you just waste your time interpreting its return code? The procedure always technically ‘succeeds‘ in that it returns without error, and so we always have to read its return code and test whether it was greater than zero or whatever the rule is.
  • OUT parameters complicate your procedure call. For example,

    mypackage.do_stuff(v_account, v_invoice_id, v_order_priority, v_retcode, v_error_message);

    Now anyone maintaining the code has to figure out which of those arguments are IN and affect processing, OUT and pass back information, or, heaven help them, IN OUT. Note that there is no immediate way to tell which is which. In the above example, v_retcode and v_error_message are reasonably clear, but what about v_invoice_id? Maybe the procedure generates an invoice and passes back its key. To check this, you will have to find the procedure‘s declaration and match each parameter against the corresponding item. If you ever have to maintain a poorly-written legacy system, you will quickly find yourself cursing OUT parameters.

  • The equivalent function is more flexible as it can be called from SQL queries (if it returns SQL types). SQL queries cannot use procedures (although SQL can call procedures using the statement.)

If you do have to provide separate return status information, you should at least consider defining a standard record type, e.g:

type status_rectype is record
( code     pls_integer
, message  varchar2(1000) );

v_return_status status_rectype;

Now you can call procedures using:

mypackage.do_stuff(v_account, v_invoice_id, v_order_priority, v_return_status);

Of course, OUT parameters have their uses and we are not recommending a complete ban; just that you should consider other options first.

Conditional Statements

Layout

  • Do not add redundant bracketting. Notice that PL/SQL provides the THEN keyword to terminate conditional expressions. For example:
    • if (x = 1)
    • if ((x = 1) and (y = 2))
    • if x = 1 and y = 2
  • The THEN keyword can follow on the same line when the condition is short and simple, or (more formally) it can start a new line directly below the if.
  • Similarly, where you have more than one condition to test, if there are only two or three and they are very simple you can place them all on one line. If there are more or they require more space, write one per line, aligned vertically:
    • if to_char(sysdate,‘D‘) > 1 and mypackage.myprocedure(1,2,3,4) not between 1 and 99 then
    • if to_char(sysdate,‘D‘) > 1
      and mypackage.myprocedure(1,2,3,4) not between 1 and 99
      then
    • if to_char(sysdate,‘D‘) > 1
      and mypackage.myprocedure(1,2,3,4) not between 1 and 99
      then
  • Where you have a mixture of OR and AND conditions, unless they are very simple (such as one-word boolean variables) use spaces to align bracketed conditions vertically:
    • if a = 1 and (b = 2 or c = 3 or d = 4) then
    • if a = 1
      and ( b = 2
      or c = 3
      or d = 4 )
      then

Usage

Ensure Conditions in ELSIF Clauses are Exclusive

The implication of ELSIF clauses is that if one condition is fulfilled, all others would fail - they are mutually exclusive. The following IF statement is a classic misuse of ELSIF clauses. It might not cause any errors, but that would just be a matter of luck. In many cases, the exclusivity is less obviously determined.

  • if sal between 0 and 10000
    then
        ...
    elsif sal between 10000 and 20000
    then
        ...
    elsif sal between 20000 and 30000
    then
        ...
    else
        ...
    end if;
  • if sal < 10000
    then
        ...
    elsif sal < 20000
    then
        ...
    elsif sal < 30000
    then
        ...
    else
        ...
    end if;
Use Boolean Elements to Improve Readability

Boolean variables and functions can greatly improve readability of programs. You can hide complex expressions behind a name which describes the expression.

Compare the two if statements below.

  • if total_sal between 10000 and 50000
    and emp_status(emp_rec.empno) = ‘N‘
    and months_between(emp_rec.hiredate, sysdate) > 10
    then
        give_raise(emp_rec.empno);
    end if;
  • eligible_for_raise :=
        total_sal between 10000 and 50000
        and emp_status(emp_rec.empno) = ‘N‘
        and months_between(emp_rec.hiredate, sysdate) > 10;
    
    if eligible_for_raise
    then
        give_raise(emp_rec.empno);
    end if;

On its own, this particular example isn‘t so great, but you can see that the approach could be very helpful in situations where various conditions are tested repeatedly in different combinations.

Avoid IF when assigning values to Boolean Variables

Although it is valid to assign explicit TRUE and FALSE values to a Boolean, using the test expression itself as a Boolean expression is much more elegant, and saves a processing step as well as several lines of code:

  • if hiredate < sysdate
    then
        date_in_past := true;
    else
        date_in_past := false;
    end if;
  • date_in_past := hiredate < sysdate;

Summary

Here is an example of some code exhibiting many of the features discussed above:

if (shipdate < add_months (sysdate, +3) or
    order_date >= add_months (sysdate, -2)) and
    cust_priority_type =‘high‘ and
    order_status = ‘O‘
then
    ship_order(‘EXpreSS‘);

elsif (order_date >= add_months (sysdate, -2) or
    add_months (sysdate, 3) > shipdate) and
    order_status = ‘O‘
then
    ship_order(‘STANDARD‘);
end if;
  • Although they have commonsense names and are in a consistent case, none of the variables have standard prefixes.
  • cust_priority_type is ambiguous on first reading, because it looks like a type definition.
  • The and and or operators, which introduce new conditions similar to bullet points, are for some reason stuck on the end of the previous conditions. This is madness.
  • The arguments EXpreSS and STANDARD are hardcoded. They should be constants defined at a higher level (or in a shared library package).
  • Note however that the bracketing is actually required, as the conditions contain a mixture of ORs and ANDs. It just looks redundant because the code is unformatted. It should be:

    if (   v_shipdate < add_months(sysdate, +3)
        or v_order_date >= add_months(sysdate, -2) )
    and v_cust_priority = ‘HIGH‘
    and v_order_status = ‘O‘
    then
        ship_order(k_ship_express);
    
    elsif (   v_order_date >= add_months(sysdate, -2)
           or add_months(sysdate, 3) > v_shipdate)
    and v_order_status = ‘O‘
    then
        ship_order(k_ship_standard); 
    end if;
  • We could go further and criticise the duplicated call to ship_order(). If it ever needed more parameters or pre-processing of values, they would all have to be coded in both places. It would probably be better to set a variable such as v_shipping_type to either k_ship_express or k_ship_standard within the IF statement, and use it in a single call to ship_order() at the end.

Loops

Some general guidelines:

  • Do not EXIT or RETURN out of a FOR or WHILE loop.
  • A FOR loop should only be used when you want to execute the body a fixed number of times. Stay for the duration or use a different loop construct.
  • The loop should be terminated only when the condition in the boundary evaluates to FALSE.
  • The Cursor FOR loop is nearly always preferable to the longwinded and slower OPEN-FETCH-EXIT-CLOSE. As a rule you should only consider OPEN-FETCH-EXIT-CLOSE when:
    • You are working with a cursor variable. The compiler can‘t automatically declare a record of cursor%ROWTYPE for weak ref cursors (since the return type is unknown until runtime), although you would think it could manage with strongly-typed ones. Currently (Oracle 9.2) it does not.
    • You want to retain the record after the loop completes. However since a single-row query is better handled with select into (simpler, faster, easier to follow), that only leaves the case where you need to loop through a number of records and only retain the final one, which must be a fairly unusual scenario.
  • Anonymous Cursor FOR loops (e.g. for r in (select cols from sometable) loop are fine. Feuerstein & Pribyll warn against putting SQL in ‘unexpected places‘ within your code, but surely right next to the processing is a pretty good place. The only downsides are:
    • You can‘t reuse the cursor.
    • You can‘t base record types on the cursor‘s %ROWTYPE.
    • Since the cursor is anonymous, you can‘t refer to its attributes. If you subsequently need its %ROWCOUNT, you will have to go back and re-code it as a Cursor FOR loop.

Note: if an exception is raised and the loop stops, that is a legitimate “early termination”.

Do not use PL/SQL where you can use a SQL statement instead.

The SQL statement will nearly always be much faster. You should replace PL/SQL loops with single SQL statements when possible.

  • Slower PL/SQL version:
    for i_year in 1..20
    loop
        insert into table1 -- Twenty INSERT statements passed to the SQL engine
        select *
        from   table2
        where  starting_year = i_year;
    end loop;
  • Faster, simpler SQL version:
    insert into table1
    select *
    from   v1table2
    where  starting_year between 1 and 20;

Constants

  • Remove all “magic numbers” and other literals (within reason) from your code. Instead, declare constants to hold those literal values. This aids maintenance by self-documenting (the name of the constant will suggest what it stands for) and by ensuring consistency in the event of the value changing.
  • Allow the value of that literal (now a constant) to be set in only one place in your code.
  • If you find that you have written a program in which a variable‘s value does not change, you should first determine whether that behavior is correct. If it is, you should then convert that variable to a constant.
  • If you do convert a variable to a constant, you should also change its name. This will help to remind anyone reading the code that your identifier refers to a constant and cannot be changed.

Why should you bother converting unchanging variables to constants? Because when you tell it like it is, the program explains itself more clearly. The declaration of a named identifier as a constant gives you information about how it should be used in the program. It can also be very useful (and a great relief) for someone debugging changes later on to know that a particular value is permanently fixed, and therefore does not need painstaking tracking in case it does.

Avoid recycling variables

Each variable you declare should have one purpose and one purpose only. The name for that variable should describe, as clearly as possible, that single-minded purpose. Using it for two different things in the course of the program just introduces a whole load of hidden dependencies and potential confusion, purely to save you typing a couple more lines of code. This is one of those things that seems like a short cut until you find yourself paying for it later.

Name subtypes to self-document code

One of the most compelling reasons for creating your own subtypes is to provide application- or function-specific datatypes that automatically document your code. A programmer-defined subtype hides the generic "computer datatype" and replaces it with a datatype that has meaning in your own environment. In naming your subtype, you should therefore avoid references to the underlying datatype and concentrate instead on the business use of the subtype.

Suppose you are building a hotel reservation system. A very useful subtype might be a room number: it is a special kind of NUMBER that is used throughout the application. So define a subtype called ROOM_NUMBER, which you can then use whenever you need to define a variable referring to a room number.

subtype room_number is rooms.room_number%type;
...
-- A declaration using the subtype:
v_open_room room_number;

To make it easiest for individual developers to be aware of and make use of standard variable declarations, consider creating a package that contains only standard subtypes, constants and variable declarations and any code necessary to initialize them.

Remove unused variables from programs

You should go through your programs and remove any part of your code that is no longer used. This is a relatively straightforward process for variables and named constants. Simply execute searches for a variable‘s name in that variable‘s scope (copying and pasting the relevant section to another editor window can make this easier). If you find that the only place it appears is its declaration, delete the declaration and, by doing so, delete one more potential question mark from your code.

There is never be a better time to review all the steps you took and understand the reasons you took them than immediately upon completion of your program. If you wait, you will find it particularly difficult to remember those parts of the program which were needed at one point, but were rendered unnecessary in the end. When someone else has to modify the code in six months‘ time, it will be a lot harder to trace these dead variables and take them out, so their job will be that much harder than it needs to be.

Use %TYPE whenever possible

Use %TYPE when a variable represents a column

Always use the %TYPE attribute to declare variables which are actually PL/SQL representations of database values. This includes a lot of your variables. Using %TYPE sometimes takes more typing, but it improves your code substantially.

Suppose you have a procedure that formats information about a customer. You need to declare a variable for each attribute of the customer: first name, last name, address, National Insurance number, etc. Declare them using %TYPE as follows:

procedure format_customer
    ( p_customer_id  customers.cst_id%type )
is
    v_first_name     customers.cst_first_name%type;
    v_last_name      customers.cst_last_name%type;
    v_address        customers.cst_address_l1%type;
    v_city           customers.cst_city%type;
    v_national_ins#  customers.cst_national_ins_number%type;
begin
    ...
end;

Using the %TYPE Attribute ensures that your variables stay synchronized with your database structure. Just as importantly, though, this declaration section is more self documenting now. The %TYPE attribute provides important information to anyone reviewing the code, stating: "These variables represent my columns in the program. When you see one of these variables, think ‘database column‘." This correlation makes it easier to understand the code, easier to change the code, and easier to recognize when one of those variables is used in an inappropriate manner. If the code ran:

procedure format_customer
    ( p_customer_id  integer )
is
    v_first_name     varchar2(30);
    v_last_name      varchar2(30);
    v_address        varchar2(500);
    v_city           varchar2(30);
    v_national_ins#  varchar2(9);

it would be less clear what each of the variables was intended to hold and what its relationship was to the CUSTOMER data.

Use %TYPE to standardise non-database declarations

While many of your local PL/SQL variables are directly related to database columns, at least some will be local-only, perhaps calculated values. You can use the %TYPE attribute to infer a variable‘s datatype from another, previously defined PL/SQL variable. For example:

v_revenue number(20,2) default 0;
v_total_revenue v_revenue%type;

The variable v_revenue acts as the standard variable for revenue data. Whenever we declare the v_total_revenue variable (or any other revenue-related variables), we can base it on v_revenue. By doing this, we guarantee a consistent declaration of revenue variables. Furthermore, if the revenue datatypes ever need to be changed again, we only have to change the way that v_revenue is declared, and recompile. All variables declared with v_revenue%TYPE will automatically adjust. (We could also have defined a subtype to achieve this.)

Note that v_revenue’s DEFAULT 0 is not inherited by v_total_revenue.

Exceptions

Named exceptions

Be careful when using raise exception_name, particularly a programmer-defined exception such as raise invalid_account, because there is no way to supply an explanatory accompanying error message. At least with Oracle-defined exceptions you can look up the code in a manual. With the above example all anyone has to go on is that an ‘account‘ was in some sense ‘invalid‘. Generic ones such as fatal_error are even less helpful. As a rule therefore, you should avoid raising programmer-defined exceptions, but only test for them in when clauses. For example, compare the following:

  • raise errorpkg.fatal_error;
  • raise_application_error
    ( errorpkg.k_fatal_error
    , ‘Credit check failed for account ‘ || r_acc.acc_id
    , true );

Both examples make use of centrally-defined exceptions, and both will return the same error code. However, the second approach provides a diagnostic explanation of what has gone wrong, appending a contextual comment to the existing Oracle error stack, while the first will just give the default "User-defined exception".

Or consider the following:

  • error_pkg.log_error(‘Invalid account ‘ || r_acc.acc_id);
    raise errorpkg.fatal_error;
  • v_error_text := ‘Invalid account ‘ || r_acc.acc_id;
    errorpkg.log_error(v_error_text); raise_application_error(errorpkg.k_invalid_account, v_error_txt );
  • v_error_text := ‘Invalid account ‘ || r_acc.acc_id;
    error_pkg.log_and_raise(errorpkg.k_invalid_account, v_error_text);

The first example makes an effort to log information about the error, but then raises a generic "User-defined exception" error to the calling procedure, leaving the poor support team to search the error log for clues.

The second takes care to use the same error text in both logging and reporting the error. Note that we do not explicity raise invalid_account by its name (nearly always a bad idea), but by using the error code constant associated with it in ERRORPKG we achieve the same thing, as well as defining an informative error message which will appear in SQLERRM.

The final example hides the whole logging-and-raising logic behind the application‘s LOG_AND_RAISE procedure, simplifying the code and leaving less room for inconsistency. (Note however that since we are creating error stacks, we normally only need to log the error at the topmost level, and so log-and-continue procedures will generally be unnecessary.)

Error codes

Also worth noting is the fact that with RAISE_APPLICATION_ERROR, the actual error code need not be particularly important. If you are not planning to publish a complete list of error codes for your application, it might be enough to have one code per package (or even one code for the whole application). This certainly simplifies things, as you can then declare a package-level constant such as k_error_code constant pls_integer := -20042, and then use k_error_code in all calls to RAISE_APPLICATION_ERROR.

Less is more

With the range of options available for handling exceptions, it can be tempting to try to use all of them, only to find that you are worse off than when you started. For example, you might define a whole range of application-specific exceptions such as invalid_account, raise them conditionally within procedures and have the calling procedure test for them in its exception handler, as text book examples often seem to suggest you should. But why should the calling procedure care why the procedure it called failed? It just failed. The details are in SQLERRM. Any required clean-up steps surely belong within the procedure that failed, not in every piece of code that has the misfortune to call it. In most cases the calling procedure can simply report something like Could not generate invoice for order 4231, and add this to the error stack using RAISE_APPLICATION_ERROR.

SQL Layout Guidelines

Left-align as for any other language

This really is the single most important point in laying out code, and is a test of whether a programmer really understands the point of layout at all. We do not do it to create arbitrary text art effects. Code layout and formatting is based on the idea of left-alignment of blocks being used to show that statements form part of the same command or set of commands with a common governing condition, with indentation levels indicating dependency relationships. We have defined this standard for PL/SQL (as for shell scripts, Java or any other programming language) and we should stick with it.

To depart from this system arbitrarily for one type of statement while observing it for others surely makes no sense, and the current fashion for doing so within SQL statements is to be deplored.

The following type of layout is a fashion statement and is not acceptable as code formatting:

  • select last_name, first_name
      from employees
     where department_id = 15
       and hire_date < sysdate;
  • update employees
       set hire_date = sysdate
     where hire_date is null
       and termination_date is null;

Instead, use:

  • select last_name, first_name
    from   employees
    where  department_id = 15
    and    hire_date < sysdate;
  • insert into employees
    ( emp_id
    , emp_firstname
    , emp_lastname )
    values
    ( emp_seq.nextval
    , r_emp.firstname
    , r_emp.lastname );
  • update employees
    set    salary = salary * v_raise_factor
    where  department_id = v_department_id
    and    termination_date is null;

Give your code some space

  • Use a separate line for each expression in a select list.
  • Place each table in a from clause on its own line.
  • Place each expression in WHERE clause on its own line.
    select last_name
         , c.name
         , max(sh.salary) best_salary_ever
    from   employees e
         , companies c
         , salary_history sh
    where  e.company_id = c.company_id
    and    e.employee_id =sh.employee_id
    and    e.hire_date > add_months(sysdate, -60);
    update employees
    set    hire_date = sysdate
         , termination_date = null
    where  department_id = 105;

Don’t overdo the spacing

  • Use no more than one blank line within code.
  • Use two blank lines between subprograms in a package or type body.

Use meaningful abbreviations for table and column aliases

Instead of arbitrary labels such as:

select cols
from   employees a
     , companies b
     , profiles c
     , sales d
where  b.com_id = a.emp_com_id
and    c.pro_com_id = b.com_id
and    d.sal_com_id (+)= c.pro_com_id

Name the table aliases on the underlying table name (they don’t strictly have to be 3 letters in length):

select cols
from   employees emp
     , companies com
     , profiles pro
     , sales sal
where  com.com_id = emp.emp_com_id
and    pro.pro_com_id = com.com_id
and    sal.sal_com_id (+)= pro.pro_com_id

Note also how the joins are arranged so that they flow through the tables in the same order as FROM list, joining consistently from left to right (including outer joins). Of course SQL accepts joins specified either way around and the optimizer may choose a different join order, but demonstrating the intended logical flow within the WHERE clause makes it easier to understand the intended relationships, and also to check that none have been left out.

Some more SQL layout examples

select da.
     , da.object_type
     , da.owner
     , decode(aa.object_name,
              null, ‘No‘, ‘Yes‘ ) granted
     , s.synonym_name
from   all_objects aa
     , dba_objects da
     , all_synonyms s
where  (    da.object_name like upper(‘&object‘)
        or  (    da.object_name like upper(substr(‘&object‘,instr(‘&object‘,‘.‘)+1))
             and da.owner = upper(substr(‘&object‘,1,instr(‘&object‘,‘.‘)-1)) )
       )
and    aa.object_name (+)= da.object_name
and    aa.object_type (+)= da.object_type
and    aa.owner (+)= da.owner
and    s.table_name (+)= da.object_name
and    s.table_owner (+)= da.owner
order by  da.                            

热门排行

今日推荐

热门手游