Safe-C Programming Language
Tutorial for developers who already know C
Here's a Safe-C program:
// date.c
from std use calendar, console;
void main()
{
DATE_TIME now;
get_datetime (out now);
printf ("We are the %02d/%02d/%04d ", now.day, now.month, now.year);
printf ("and it is %02d:%02d:%02d.\n", now.hour, now.min, now.sec);
}
Compilation
As in C, a Safe-C program uses the file extension .h
for interfaces and .c for program bodies.
A "make" facility is built into the compiler:
when compiling, you only need to give the compiler the name of the main .c file,
and it will automatically follow the imports of libraries
(from std use xxx;) and of local files (use yyy;).
Compilation units
The .h and .c files of a component must always be stored in the same folder
so that the compiler can find them. You don't need to import the .h file into the
corresponding .c file; the compiler does that automatically.
Example:
// data.h
float global_delta = 1.0; // public variable
const int MAX = 100; // public constant
void insert (int element); // public function
// data.c
int i = 0;
int table[MAX];
public void insert (int element) // public function body
{
table[i++] = element;
}
Unlike C, Safe-C doesn't use the keywords static and extern.
Variables declared in .h files no longer need to be declared a second time in the .c file.
Instead of marking functions that are internal to a file with the keyword static,
you mark functions that are visible outside the file with the keyword public.
Note that the keyword public is never used in .h files, because everything in them is public anyway.
Here's an example that uses our component 'data':
// main.c
from std use console;
use util/data; // component data is stored in the sub-folder "util"
void main()
{
global_delta = 2.0;
printf ("MAX = %d\n", MAX);
insert (1);
insert (element => 2); // call with explicit parameter name
data.insert (3); // call with prefixed component name
}
Initialisation of variables
All local variables must be initialized before their first use, including
arrays and structures, which must receive a complete initial value.
This can be done with the instruction clear, which replaces
C's "memset (&v, 0x00, sizeof(v));", or by assigning a complete aggregate:
void main()
{
int tab[3];
clear tab; // all elements to 0
tab = {all => 5}; // all elements to 5
tab = {5, 6, 7}; // assignment of a full aggregate
}
Structures must likewise be initialized:
void main()
{
struct KEY
{
int nr;
char c;
}
KEY k;
clear k; // all elements to 0
k = {1, 'a'}; // simple aggregate
k = {nr=>1, c=>'a'}; // aggregate with names
}
Data Types
Here's a quick overview of all data types:
- signed integers: int1, int2, int4, int8
and their aliases: tiny, short, int, long.
- unsigned integers: uint1, uint2, uint4
and their aliases: byte, ushort, uint.
- enumeration types: char, wchar, bool.
- floating types: float and double.
- array, struct, union, safe pointer (^), unsafe pointer (*), pointer to function, opaque, generic.
Arrays are declared as in C, with one small difference:
void main()
{
char t1[10], t2[10];
char[10] t1, t2;
}
The two declaration lines above are equivalent, because whatever is specified
on the left with the type applies to all identifiers on the right.
Combining both syntaxes is allowed.
You can also declare an array type of unspecified length. For example, string is predefined
as an open array of char (char[]), hence the following four declarations are equivalent:
void main()
{
char[100] buffer1;
string(100) buffer2;
char buffer3[100];
string buffer4(100);
}
Parameter Modes
There are 3 parameter modes:
- in (simple types are passed by value, arrays and struct by address)
- ref (by address)
- out (by address also)
Parameters of mode 'in' are read-only, they cannot be assigned a new value.
Parameters of mode 'ref' have no restrictions.
Parameters of mode 'out' are treated as uninitialized variables:
they must receive a complete value before the function ends.
void foo (int i, ref int j, out int k)
{
k = i + j;
}
A function call repeats the mode of each ref and out argument:
void main ()
{
int i, j, k;
i = 1;
j = 2;
foo (i, ref j, out k);
}
As you can see, unlike in C, you don't use any & or * symbols.
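For readers coming from C, the three modes map roughly onto familiar idioms. The sketch below is plain C, not Safe-C, and only an analogy: 'in' becomes a by-value parameter, while 'ref' and 'out' become pointers. The helper name call_foo is hypothetical, and the compiler-enforced rules (read-only 'in', fully assigned 'out') have no direct C equivalent.

```c
/* C analogue of the Safe-C declaration:
   void foo (int i, ref int j, out int k) */
void foo(int i, int *j, int *k)
{
    *k = i + *j;   /* the 'out' parameter is written before returning */
}

/* C analogue of the Safe-C call: foo (i, ref j, out k); */
int call_foo(void)
{
    int i = 1, j = 2, k;
    foo(i, &j, &k);   /* the & symbols are what Safe-C's 'ref'/'out' replace */
    return k;
}
```

In C nothing stops the callee from leaving *k unassigned; Safe-C rejects that at compile time according to the rule on 'out' parameters above.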
Arrays are passed like this:
void foo1 (char[10] str);
void foo2 (char[] str);
void foo3 (string str);
Function foo1 accepts only arrays of char of length 10:
at execution time, only an address is passed on the stack.
Function foo2 accepts arrays of char of any length:
at execution time, both the address and the length are passed on the stack,
so that the array length can be queried and checked within the function.
Function foo3 is equivalent to foo2 but more pleasant to read.
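To picture what "the address and the length are passed" means, here is a rough C sketch of an open array parameter. The type char_slice and the function slice_get are hypothetical names chosen for illustration; the actual layout used by the Safe-C compiler is not described in this tutorial.

```c
#include <stddef.h>

/* Hypothetical C view of a Safe-C open array parameter (char[] or string):
   the address and the length travel together. */
typedef struct {
    const char *data;
    size_t      length;
} char_slice;

/* Returns 1 and stores the element in *out if index is in range, else 0.
   Safe-C would stop with a fatal error instead of returning 0. */
int slice_get(char_slice s, size_t index, char *out)
{
    if (index >= s.length)
        return 0;
    *out = s.data[index];
    return 1;
}
```

Because the length is always available, the callee can check every index without an extra size argument supplied by the programmer.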
Attributes
The attribute 'length allows you to take the length of any array:
void main()
{
char tab[3];
int i;
i = tab'length; // 3
}
Attributes 'min and 'max allow you to take the minimum/maximum of an integer type:
void main()
{
int i, petit, grand;
petit = i'min; // -2_147_483_648
grand = i'max; // +2_147_483_647
}
Attributes 'first and 'last allow you to take the first/last value of an enumeration type:
void main()
{
enum COLOR {RED, GREEN, BLUE};
COLOR a, b;
a = COLOR'first; // RED
b = a'last; // BLUE
}
The attribute 'string allows you to convert an enumeration value into a string holding its literal name,
which is useful for passing such values to printf when debugging:
void main()
{
enum COLOR {RED, GREEN, BLUE};
COLOR c = RED;
printf ("c = %s\n", c'string);
printf ("first color = %s\n", COLOR'first'string);
}
Slices
A slice of an array is like a slice of bread. It has a beginning and a length:
string(5) s;
string(2) t;
s = "Hello";
t = s[3:2]; // copies "lo" (start=3, length=2)
s[1:4] = "ELLO"; // keeps the H but changes the rest
Array indexes and slices are checked and generate a fatal error in case of illegal values.
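The check performed on a slice such as s[3:2] can be sketched in C as follows. This is an illustration only: slice_copy is a hypothetical helper, it returns -1 where Safe-C would raise a fatal error, and the rule that the slice must fit in the destination is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Copies src[start .. start+length-1] into dst after checking both bounds.
   Returns 0 on success, -1 where Safe-C would report a fatal error. */
int slice_copy(char *dst, size_t dst_len,
               const char *src, size_t src_len,
               size_t start, size_t length)
{
    if (start > src_len || length > src_len - start)  /* slice lies inside src? */
        return -1;
    if (length > dst_len)                             /* slice fits in dst? (assumed rule) */
        return -1;
    memcpy(dst, src + start, length);
    return 0;
}
```

The first test is written as length > src_len - start rather than start + length > src_len so that the check itself cannot overflow.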
Strings
The component 'strings' contains well-known functions: strcpy, strcat, sprintf, etc.
Note that, unlike in C, the terminating nul character is optional.
So if you use strcpy() to copy "Hello" into a string of length 5,
there will be no terminating nul character.
from std use strings;
void main()
{
string(64) str, str2;
int i = 2, j = 3, len;
sprintf (out str, "value of i is : %d", i);
sprintf (out str2, " and j equals : %d", j);
strcat (ref str, str2);
len = strlen (str);
}
If you strcpy a string of length 6 into an array of length 5, you will get a fatal error.
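A C sketch may help show how strings with an optional nul terminator can work when the buffer length is always known. The functions bounded_strlen and bounded_strcpy are hypothetical and are not the Safe-C library routines; the copy returns -1 where Safe-C would stop with a fatal error.

```c
#include <stddef.h>

/* Length of the text held in a buffer of 'cap' chars: stops at the first
   nul, or at the end of the buffer when there is no nul. */
size_t bounded_strlen(const char *buf, size_t cap)
{
    size_t n = 0;
    while (n < cap && buf[n] != '\0')
        n++;
    return n;
}

/* Copies src into dst. Returns 0 on success, -1 if the text does not fit.
   A terminating nul is written only when there is room left for it. */
int bounded_strcpy(char *dst, size_t dst_cap, const char *src, size_t src_cap)
{
    size_t len = bounded_strlen(src, src_cap);
    if (len > dst_cap)
        return -1;
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
    if (len < dst_cap)
        dst[len] = '\0';
    return 0;
}
```

Copying "Hello" into a 5-char buffer fills it exactly and leaves no nul; the length is then recovered from the buffer capacity.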
Constants
The following C declarations:
#define MAX 100
#define TITLE "programme.c"
are written as follows in Safe-C:
const int MAX = 100;
const string TITLE = "programme.c";
Jagged arrays
The following C declaration:
char *table[] = {"This", "is", "an", "example"};
is written in Safe-C as:
const string table[] = {"This", "is", "an", "example"};
You can obtain the number of strings using table'length.
Structures
Structures are declared almost as in C:
struct PERSON
{
char[20] name;
int age;
}
PERSON per;
Furthermore, there are special structures featuring a 'discriminant' of an enumeration type:
enum TypeShape {POINT, SQUARE, CIRCLE, TRIANGLE};
struct Shape (TypeShape kind)
{
int x, y;
switch (kind)
{
case POINT:
null;
case SQUARE:
int side;
case CIRCLE:
int radius;
case TRIANGLE:
int base, height;
}
}
Shape(SQUARE) s = {x=>1, y=>2, side=>3};
The size of the structure depends on the discriminant value given when the variable is created.
The compiler never allocates the maximum size, only the size needed for the given variant.
Variant structures can be passed as parameters:
void foo1 (Shape(POINT) p)
{
// ...
}
void foo2 (Shape s)
{
switch (s.kind)
{
case POINT:
// ...
break;
}
}
Function foo1 will accept only a Shape of type POINT,
whereas foo2 will accept any variant.
foo2 receives the discriminant s.kind in a hidden parameter
so it knows which variant it is.
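C developers usually model such a structure with a tagged union. The sketch below is a C approximation, not the Safe-C layout: it stores the tag inside the struct (an assumption; Safe-C may keep it elsewhere), and the hypothetical function shape_size shows how the space actually needed can depend on the variant.

```c
#include <stddef.h>

typedef enum { POINT, SQUARE, CIRCLE, TRIANGLE } TypeShape;

/* Tagged-union approximation of the Safe-C struct Shape (TypeShape kind). */
typedef struct {
    TypeShape kind;          /* discriminant */
    int x, y;                /* fields common to all variants */
    union {
        struct { int side; } square;
        struct { int radius; } circle;
        struct { int base, height; } triangle;
    } u;
} Shape;

/* Bytes a given variant needs: the common part plus its own fields. */
size_t shape_size(TypeShape kind)
{
    size_t common = offsetof(Shape, u);
    switch (kind) {
    case POINT:    return common;
    case SQUARE:   return common + sizeof(int);
    case CIRCLE:   return common + sizeof(int);
    case TRIANGLE: return common + 2 * sizeof(int);
    }
    return 0;
}
```

A plain C union always reserves room for the largest variant; allocating only shape_size(kind) bytes is the behavior the text attributes to Safe-C.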
Packed types
packed struct PERSON
{
char[20] name;
int age;
}
PERSON per;
The keyword packed tells the compiler not to align the fields of the structure.
Consequently the structure becomes 'portable' and can be passed through an input/output function
to the outside world (file, network, ..).
A packed structure cannot contain a pointer ^ (otherwise you could read an arbitrary value from
the outside world into a pointer and corrupt memory).
A rule implicitly converts all packed types into byte arrays
when they are passed as parameters.
Since read() and write() are declared like this:
int read (int fd, out byte[] buffer);
int write (int fd, byte[] buffer);
you can write this:
rc = read (fd, out per);
// or
rc = write (fd, per);
Moreover, any packed variable can be converted into a byte array using the attribute 'byte:
byte tab[4];
float f = 1.2;
tab = f'byte; // copies 4 bytes
This allows copying the content of any variable to any other variable
(unless the variable contains a pointer ^, which is excluded from these conversions):
int i;
float f = 1.2;
i'byte = f'byte; // copies 4 bytes
i'byte[0] = f'byte[0]; // copies the first byte
i'byte[2:2] = f'byte[2:2]; // copies the last 2 bytes
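In C, the closest well-defined equivalent of these 'byte copies is memcpy. The sketch below assumes a 4-byte float and uses the hypothetical helper names float_bits and copy_last_two_bytes; it illustrates the idea and says nothing about how Safe-C implements the attribute.

```c
#include <stdint.h>
#include <string.h>

/* C analogue of  i'byte = f'byte;  copies the 4 bytes of f into i.
   Assumes sizeof(float) == sizeof(int32_t). */
int32_t float_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return i;
}

/* C analogue of  i'byte[2:2] = f'byte[2:2];  copies only bytes 2 and 3. */
int32_t copy_last_two_bytes(int32_t i, float f)
{
    memcpy((unsigned char *)&i + 2, (const unsigned char *)&f + 2, 2);
    return i;
}
```

Which half of the value bytes 2 and 3 hold depends on the machine's byte order, in C as presumably in Safe-C.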
The type object
The type object is predefined as a byte array:
typedef byte[] object; // open array of byte
The type object[] is used in the declaration of functions that take a variable number of
parameters, like these:
int sprintf (out string buffer, string format, object[] arg);
int sscanf (string buffer, string format, out object[] arg);
In the body of these functions, you can obtain the number of parameters
with arg'length; each parameter is accessible through arg[i] and has type
array of byte. Depending on the string 'format', the function can then
convert each one to the desired type using the attribute 'byte.
References
A reference lets you give a variable a shorter name.
In practice, the reference always stores the variable's address and sometimes, for an array, its length too.
ref string s = p^.line[i]^;
printf ("%s\n", s);
Pointer types
A pointer is declared using the symbol ^. The keyword new allows you to allocate variables
on the heap, including ones of dynamic size, with or without specifying an initial value.
There are pointers to simple types:
int^ p = new int; // object initialized to zero
int^ p2 = new int ' (1); // explicit initialization to 1.
or to structures:
struct NODE
{
int nr;
NODE^ next;
}
NODE^ p = new NODE; // object initialized to zero
NODE^ p2 = new NODE ' {1, null}; // explicit initialization using aggregate
NODE^ p3 = new NODE ' (p^); // initialized with value of another object
There are two kinds of array objects: those with constant length (a constant length is specified in the
pointer declaration):
int[3]^ p = new int[3]; // always points to an array of length 3.
int[3]^ q = new int ' {1, 2, 3}; // same
and those with dynamic length (no length is specified in the pointer declaration):
int[]^ p = new int[3];
int[]^ q = new int ' {1, 2, 3};
Watch out for the difference: the latter have a length field stored in the header of the heap object;
they are not compatible with the former.
Lastly, there are structures with a discriminant, whose value is stored in a header of the heap object:
Shape^ p = new Shape(POINT);
Shape^ q = new Shape(POINT) ' {x=>1, y=>2};
Shape^ r = new Shape ' (q^);
Implementation of pointer types
A pointer type ^ is secured by a 'tombstone' mechanism: each pointer
points to an internal structure called a Tombstone, which contains the address
of the real object allocated on the heap as well as a reference counter.
This mechanism, handled in a thread-safe way, prevents any operation that could corrupt memory.
If the system runs out of memory during a new,
the program stops with a fatal error, exactly as when the stack overflows after too many
recursive calls.
It's up to you to manage your memory consumption.
As in C, each memory block allocated with new must be freed
after use with the instruction free.
Using free on an object that is still referenced by any thread,
or that was already freed earlier, causes a fatal error.
On the other hand, the language will not notify you if you forget to call free,
because that doesn't corrupt memory.
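The tombstone idea can be sketched in a few lines of C. Everything here is a simplified assumption for illustration: the names (Tombstone, ts_new, ts_share, ts_drop, ts_free) are hypothetical, the code is not thread-safe, and errors are returned as -1 where Safe-C would stop with a fatal error. The real Safe-C runtime may count references differently.

```c
#include <stdlib.h>

/* Hypothetical tombstone: one per heap object, shared by all pointers. */
typedef struct {
    void *object;     /* address of the real heap object, NULL once freed */
    int   refcount;   /* number of pointers referring to this tombstone */
} Tombstone;

/* Rough analogue of 'new': allocates a zero-initialized object. */
Tombstone *ts_new(size_t size)
{
    Tombstone *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->object = calloc(1, size);
    if (t->object == NULL) {
        free(t);
        return NULL;
    }
    t->refcount = 1;
    return t;
}

/* Copying a pointer adds a reference; dropping one removes it. */
Tombstone *ts_share(Tombstone *t) { t->refcount++; return t; }
void ts_drop(Tombstone *t) { t->refcount--; }

/* Rough analogue of 'free': fails if the object was already freed
   or is still referenced through another pointer. */
int ts_free(Tombstone *t)
{
    if (t->object == NULL || t->refcount > 1)
        return -1;
    free(t->object);
    t->object = NULL;   /* the tombstone stays, so a stale pointer is detectable */
    return 0;
}
```

The extra indirection is the price of safety: a dangling pointer still reaches a valid tombstone, which can report that the object is gone instead of handing out freed memory.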
Pointers to functions
Pointers to functions work as in C, without surprises.
Here's an example:
void treat_node (Shape s); // function declaration
typedef void TREAT (Shape s); // function pointer type
void treatment (Shape s)
{
TREAT treat; // function pointer variable
treat = null;
treat = treat_node; // parameter modes and types must match
if (treat != null)
treat (s);
}
Unsafe pointers
To interface libraries with the operating system, the old C pointers are available in Safe-C.
For example, the operator & can be used to take an object's address,
an unsafe pointer can be indexed as in p[i] or dereferenced to a field as in p->field,
and the operators ++ and -- operate on unsafe pointers.
All of this is, however, only available in an unsafe section:
#begin unsafe
const string filename = "Test\0";
char *p = &filename;
p++;
#end unsafe
Threads
The operator run allows you to start a thread very easily:
void my_thread ()
{
}
void main()
{
int rc;
rc = run my_thread (); // starts a thread (rc: 0=OK, -1=error)
}
The function my_thread can have at most one parameter.
Opaque types
Opaque types provide a very simple form of class in which the fields of a structure are only accessible
in the .c file corresponding to the .h file where the opaque type is declared.
Furthermore, all operations that would copy (clone) a value of the opaque type are disallowed.
// drawing.h
struct DRAW_CONTEXT; // opaque type
void init (out DRAW_CONTEXT d);
void circle (ref DRAW_CONTEXT d, int x, int y, int radius);
// drawing.c
struct DRAW_CONTEXT // full struct type
{
int x, y, dx, dy;
IMAGE^ image;
}
public void init (out DRAW_CONTEXT d)
{
// ..
}
public void circle (ref DRAW_CONTEXT d, int x, int y, int radius)
{
// ..
}
// main.c
use drawing;
void main()
{
DRAW_CONTEXT a, b;
init (out a);
b = a; // ERROR : assignment not allowed for limited types
}
Generic packages
Safe-C lets you declare generic packages, so you can write algorithms that can be instantiated
for a given type. This has the same effect as C's macros, except that the compiler
doesn't just mechanically replace the generic type by the actual type: the whole package
is syntactically checked.
Non-generic packages can also be declared, as well as nested packages.
Here's an example of a bubble sort instantiated for the type int:
// bubble.h
generic <ELEMENT> // generic type ELEMENT
int compare (ELEMENT a,
ELEMENT b); // return -1 if a<b, 0 if a==b, +1 if a>b
package BubbleSort
void sort (ref ELEMENT table[]);
end BubbleSort;
// bubble.c
package body BubbleSort
public void sort (ref ELEMENT table[])
{
int i, j;
ELEMENT temp;
for (i=1; i<table'length; i++)
{
for (j=i; j>0; j--)
{
if (compare (table[j-1], table[j]) <= 0)
break;
temp = table[j-1];
table[j-1] = table[j];
table[j] = temp;
}
}
}
end BubbleSort;
int compare_int (int a, int b)
{
if (a < b) return -1;
if (a > b) return +1;
return 0;
}
package Sort_int = new BubbleSort (ELEMENT => int,
compare => compare_int);
void main()
{
int table[5] = {2, 19, 3, 9, 4};
sort (ref table); // must be written Sort_int.sort if ambiguous
}
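For comparison, here is how the same algorithm is commonly made generic in plain C, in the style of qsort: the element type is erased to bytes and the comparison is passed as a function pointer. The names bubble_sort, swap_bytes and compare_fn are hypothetical. Unlike the Safe-C package, nothing here lets the compiler check that the comparison matches the element type.

```c
#include <stddef.h>

typedef int (*compare_fn)(const void *a, const void *b);

/* Swaps two elements of 'size' bytes, one byte at a time. */
static void swap_bytes(unsigned char *a, unsigned char *b, size_t size)
{
    while (size--) {
        unsigned char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* Same loop structure as BubbleSort.sort, but type-erased. */
void bubble_sort(void *table, size_t length, size_t size, compare_fn compare)
{
    unsigned char *base = table;
    for (size_t i = 1; i < length; i++) {
        for (size_t j = i; j > 0; j--) {
            unsigned char *a = base + (j - 1) * size;
            unsigned char *b = base + j * size;
            if (compare(a, b) <= 0)
                break;              /* already in order: stop sinking */
            swap_bytes(a, b, size);
        }
    }
}

/* Returns -1 if a<b, 0 if a==b, +1 if a>b. */
int compare_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

A call such as bubble_sort(table, 5, sizeof table[0], compare_int) plays the role of the Sort_int instantiation, with the type information supplied at run time instead of at compile time.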
A few more things
To close this chapter, here are a few more short notes:
- the 16-bit type wchar means Safe-C supports UTF-16, so you can write Chinese or Japanese characters;
this works even in the source code, so you can write constant Japanese strings as L"".
- the instruction assert b; can be used to check an assertion during compilation or at runtime;
- the instruction abort; stops the program with a fatal error;
- the instruction sleep n; suspends a thread for a specified time; sleep takes an argument of type int or float, in seconds;
- the instruction _unused v; specifies that a variable is unused, to avoid a compiler warning;
- in case of a fatal error, using the library unit 'exception' generates a file called
CRASH-REPORT.TXT that helps the programmer locate the error.
That's it: you now know the most important parts of the Safe-C programming language!
All the rest (operators, instructions) should be familiar to you if you already know C.