Programming Distributed Collaboration Interaction Through the World Wide Web
University of Calgary, Department of Computer Science
MSc Thesis © Roberto A. Flores June, 1997
CHAPTER 3 The Java Programming Language
3.1 THE JAVA PROGRAMMING LANGUAGE.
In 1985, researchers at Sun Microsystems were working on an innovative windowing system called NeWS, which stands for Network extensible Window System. This system was implemented as a distributed system in which client computers ran lightweight processes that communicated with server applications using messages (Gosling, Rosenthal and Arden, 1989). Typically, client processes act as receivers of user events, which are translated into commands and transmitted to server processes. In response, server processes return programs for the client to execute. These programs can perform operations on the display and receive events from the keyboard and the mouse, thus repeating the interaction process. Programs downloaded from servers are written in the PostScript programming language (Adobe, 1985), an interpreted language that makes code portable between different computers and operating systems.
NeWS never gained enough market share to succeed, and the project was canceled at the beginning of the 1990s, with its team members moving to other projects. One of these projects, which had the mandate to develop software for consumer electronics, originated the Java language. When C++ proved unsuitable for the task assigned to that project, the Java programming language was created. Even though NeWS is not strictly a predecessor of Java, the experience gained from its development may have helped to shape the features of the Java programming language.
A programming language is usually characterized by its main features. Java is depicted as an object-oriented, distributed, secure, multi-threaded and portable programming language. These characteristics are detailed in the following sections.
Java is an object-oriented programming language. For programmers, this means focusing on the application data and on the methods needed to manipulate that data, rather than on functions and procedures. Java was designed with features found in previously developed object-oriented languages. However, it manages to strike a balance between pure object-oriented models, such as Smalltalk (Goldberg and Robson, 1989), and non-object-oriented models, such as C.
Booch (1994) described several features required for a programming language to be labeled as object oriented:
- Abstraction: "An abstraction denotes the essential characteristics of an object that distinguish it from all other kinds of objects and thus provide crisply defined conceptual boundaries, relative to the perspective of the viewer." (Pg. 41)
The parameters relevant to this feature are the availability of instance variables and methods, and of class variables and methods. The main difference between these two groups is that instance variables and methods can only be used through an instance of the class, while class variables and methods can be used directly by qualifying them with the name of the class. Additionally, the values of instance variables are exclusive to each instance, while the value assigned to a class variable is shared by all the instances of that class. All of these features are supported by the Java language.
- Encapsulation: "Encapsulation is the process of hiding all the details of an object that do not contribute to its essential characteristics." (Pg. 50)
In the case of Java, several levels of hiding can be used for variables and methods. These levels of encapsulation are linked to the following modifiers:
- no modifier: if no reserved word is used, access is granted to invocations made from within the class of declaration and from classes belonging to the same package.
- public: no access restrictions.
- private: access is granted to invocations from inside the class.
- protected: access is allowed to invocations from within the class, from other classes belonging to the same package, and from subclasses of the declaring class.
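- private protected: access is forbidden to invocations that do not belong to the class of declaration or its subclasses. This combination was accepted only by early releases of the language and is rejected by later compilers.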
- Modularity: "Modularity is the property of a system that has been decomposed into a set of cohesive and loosely coupled modules." (Pg. 57)
Java uses two levels of modularity: by classes (where each class is a container of variables and methods), and by packages (which is the grouping of classes into logical units).
- Hierarchy: "Hierarchy is a ranking or ordering of abstractions." (Pg. 59)
Java's object model is based on a single-inheritance class hierarchy, with the class Object as the root class. This means that every class has just one immediate parent, and that the class Object is a superclass of all classes. Although the class hierarchy is based on single inheritance, a form of multiple inheritance is available through interfaces. The idea of interfaces is borrowed from the protocols found in Objective-C. Interfaces are essentially abstract classes that declare, but do not implement, methods; these methods are implemented by the classes that adopt the interface. Variables declared in an interface are handled as class variables by all classes using that interface.
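The features listed above can be sketched in a few lines of Java. The following is only an illustrative sketch: the names Shape, Node, Drawable and Named are hypothetical and do not come from the system described in this thesis. It shows a class variable shared by all instances, an instance variable hidden with private, single class inheritance rooted at Object, and a class adopting two interfaces.

```java
// Illustrative sketch only: all class and interface names are hypothetical.
class ObjectModelDemo {
    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        System.out.println(a.name() + ", " + b.name());   // prints "node-1, node-2"
        System.out.println(a.draw());                     // prints "node of size 10"
        System.out.println(Shape.createdCount());         // prints "2"
        Object root = a;                                  // every class descends from Object
        System.out.println(root instanceof Drawable);     // prints "true"
    }
}

interface Drawable {
    int DEFAULT_SIZE = 10;   // implicitly public static final: a class variable
    String draw();           // declared here, implemented by the adopting class
}

interface Named {
    String name();
}

class Shape {
    private static int created = 0;  // class variable: one copy shared by all instances
    private final int id;            // instance variable: one copy per object, hidden from other classes
    protected int size;              // visible to subclasses and to classes in the same package

    Shape(int size) {
        this.size = size;
        created++;
        this.id = created;
    }

    int id() { return id; }                         // instance method: requires an object
    static int createdCount() { return created; }   // class method: invoked through the class name
}

// Single class inheritance (extends Shape) combined with two interfaces.
class Node extends Shape implements Drawable, Named {
    Node() { super(DEFAULT_SIZE); }
    public String draw() { return "node of size " + size; }
    public String name() { return "node-" + id(); }
}
```

Note that Shape.createdCount() is invoked without any instance, while id() can only be reached through an object such as a or b.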
Java is a distributed programming language. It supports both the TCP/IP (Transmission Control Protocol/Internet Protocol) and the UDP (User Datagram Protocol) families. Of these protocols, TCP/IP is used for reliable stream-based communications, and UDP for fast point-to-point datagram-oriented communications. Java's networking classes also include classes to handle Internet addresses and to download the contents of resources associated with a URL.
Java is portable or, more accurately, programs written in Java can be executed on any computer for which a Java Virtual Machine is implemented. This portability is based on Java's platform neutrality and interpreted nature.
The cornerstone of Java's portability is a proprietary set of intermediate instructions called bytecode, which make up all compiled Java programs. Bytecode consists of sequences of bytes representing instructions for the Java Virtual Machine, a simulated CPU implemented by Java interpreters. In practice, when an interpreter loads a program, each byte is evaluated in software, changing the state of the virtual CPU to reflect the changing state of execution of the program.
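The idea of evaluating each byte in software can be illustrated with a toy stack machine. This is a minimal sketch under stated assumptions: the four opcodes below (PUSH, ADD, MUL, HALT) are invented for the example and are not the instruction set of the Java Virtual Machine. The sketch only shows how an interpreter loop updates the state of a simulated CPU, here a program counter and an operand stack.

```java
// A toy stack machine. The opcodes are hypothetical and are NOT real JVM bytecode.
class ToyInterpreter {
    static final byte HALT = 0;  // stop and return the value on top of the stack
    static final byte PUSH = 1;  // push the byte that follows onto the stack
    static final byte ADD  = 2;  // pop two values, push their sum
    static final byte MUL  = 3;  // pop two values, push their product

    static int run(byte[] code) {
        int[] stack = new int[16];
        int sp = 0;  // stack pointer: part of the virtual CPU state
        int pc = 0;  // program counter: index of the next byte to evaluate
        while (true) {
            byte op = code[pc++];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case MUL: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left * right;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4 as a sequence of bytes.
        byte[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program));  // prints 20
    }
}
```

Because the program is just an array of bytes, the same array produces the same result on any machine where the interpreter itself has been implemented, which is the essence of the portability argument made above.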
Additional characteristics that support portability in Java are the abstraction of primitive data types and graphical user interfaces. In the case of primitive data types, as shown in Table 3, they are specified to be of a fixed size regardless of the operating system of execution.
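Type | Contains | Size
boolean | true or false | n/a
char | Unicode character | 16 bits
byte | signed integer | 8 bits
short | signed integer | 16 bits
int | signed integer | 32 bits
long | signed integer | 64 bits
float | IEEE 754 floating-point | 32 bits
double | IEEE 754 floating-point | 64 bits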
Table 3. Java Primitive Data Types.
To handle user interfaces, Java's designers developed an abstract windowing library that acts as a wrapper for the native widgets found in the major graphical environments. This way, the use of components is unified under a single set of classes, which are independent of the platform of execution. Figure 4 shows the class hierarchy for the widgets available in the core release of Java. These components represent just a portion of the entire abstract windowing library.
Figure 4. Component classes from the Abstract Window Toolkit library.
The classes provided by Java allow the creation of buttons, canvases, checkboxes, radio buttons, labels, list boxes, combo boxes, scroll bars, input lines, input areas, windows, panels, dialogs, and a practical file dialog for selecting disk files.
Java is intended to be a secure language. Security is an important concern, since Java is targeted to networking environments. Based on the premise that no downloaded program is to be trusted, Java implements several security mechanisms to protect users against malicious code.
When compiled, Java source code is checked for compliance with the memory allocation and reference model. Under this model, declarations for direct access to memory addresses are not allowed. Additionally, memory layout decisions are not made at compilation time. Instead, compilers generate handles that are resolved to real memory addresses at runtime, preventing programmers from using such addresses to break into systems.
Even though Java compilers ensure that source code behaves according to the safety rules, interpreters have no means to check that downloaded bytecode was produced by a well-behaved compiler. Before trusting downloaded code, interpreters subject programs to verification through a series of tests. These tests range from simple verification of the format of instructions to validating the code through a simple theorem verifier. Once the verification process is done, interpreters can proceed to execution knowing that the code will run securely. For detailed information on the verification process, please refer to the Java Security section in Chapter 2.
Unfortunately, the verification process in Java is not as secure as is claimed, since it lacks formal semantics and a formal description of the type system. This circumstance makes it impossible to formally prove the correctness of the runtime verifier (Dean, Felten and Wallach, 1996). As a result, the verification process cannot be proven correct, since its exact behavior for every possible set of bytecode is uncertain.
Java is a multi-threaded language. It provides support for multiple lightweight processes within a program. The main problem in writing multi-threaded programs lies in making methods safe to be accessed by multiple concurrent threads. This task usually implies the management of locks to control and synchronize access to resources.
Java supports pre-emptive multi-threading both at the language level and through the runtime system and thread objects. At the language level, multi-threading is supported by locks (or monitors) used for synchronization. Every object and every class has a lock that can be used for this purpose. For example, methods declared synchronized do not run concurrently on the same object. This behavior is enforced automatically by granting the lock to the first thread entering a synchronized method: the lock of the object for instance methods, or the lock of the class for static methods. The thread releases the lock when it exits the method, or temporarily while it waits inside the method through a call to wait(). Support for threads at the class level is provided by the Thread class, which implements methods to start, stop and handle threads, and by the Runnable interface, which provides the abstraction required for instances of a class to be run as threads.
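As a minimal sketch of these mechanisms, the following example lets several threads update one shared counter through synchronized methods. The class name SynchronizedDemo and the method countWith are hypothetical and serve only as illustration. Each worker is a Runnable handed to a Thread object, and the synchronized instance methods are guarded by the lock of the counter object they are invoked on.

```java
// Illustrative sketch only: class and method names are hypothetical.
class SynchronizedDemo {
    private int count = 0;

    // Only one thread at a time can hold this object's lock and run these methods.
    synchronized void increment() { count++; }
    synchronized int value() { return count; }

    // Starts the given number of threads; each increments a shared counter perThread times.
    static int countWith(int threads, final int perThread) {
        final SynchronizedDemo counter = new SynchronizedDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        counter.increment();
                    }
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();  // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10000));  // prints 40000
    }
}
```

Without the synchronized modifier on increment(), two threads could read the same value of count before either writes it back, and the final total could fall short of the expected value.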
3.2 PROGRAMMING FOR THE INTERNET AND THE WORLD WIDE WEB.
Rather than creating new HTML extensions, Java made popular the notion of downloadable programs that can run inside Web browsers. The alpha release of Java, back in 1995, included a Web browser called HotJava. This browser allowed normal Web navigation plus the ability to execute Java applets hyperlinked to HTML documents. Shortly thereafter, Netscape announced its intention to license Java and integrate it into the second version of its market-leading Navigator browser.
HotJava and Netscape Navigator are not the only browsers that support Java applets, but they were first in order of appearance and current market share, respectively. Both browsers have promoted the use of Java as a programming language for the Web. However, HotJava and Netscape Navigator have followed different patterns of development and, at the time of writing this thesis, HotJava has only reached beta release status while Navigator is on the brink of version 4. Due to its wide availability and advanced state of development, Netscape Navigator is chosen for further study of the integration of Java with the Web.
In order to evaluate the suitability of Java as a programming language for the Web, two characteristics have to be observed: first, the level of integration with browsers, and second, the availability of tools to perform distributed operations.
3.2.1 INTEGRATION WITH NETSCAPE NAVIGATOR.
function handleEvent(id, value1, value2, value3)
<OPTION SELECTED> Node - Rectangle
<OPTION> Node - Rounded Rectangle
<OPTION> Node - Ellipse
<OPTION> Line - Binary
<OPTION> Line - Trinary
<OPTION> Line - Quadrary
<OPTION> Context Box
<INPUT TYPE="BUTTON" VALUE="New"
public void init()
JSObject win = JSObject.getWindow(this);
public boolean mouseUp(Event e, int x, int y)
3.2.2 INTERNET NETWORKING.
The java.net package included as part of Java provides the infrastructure needed to achieve networking operations. The basic protocols to deal with the Internet are implemented in a few classes that encapsulate their functionality without involving the programmer with low-level networking details. Classes included in this package allow one to represent Internet addresses, to access resources referenced by URLs, to perform low-level networking using datagrams, and to communicate using stream sockets.
The basic classes required for networking operations are URL and InetAddress. These classes, and their role in initializing other classes, are explained as follows:
- URL: The URL class implements Uniform Resource Locators. It provides the most basic interface to perform networking operations, since resources referred to by a URL can be downloaded using a single method invocation. URL is also used to initialize objects of the URLConnection class. This class provides methods beyond those of the URL class to perform complex manipulation of Internet resources. For example, using URLConnection objects it is possible to obtain information about the resource pointed to, such as its content type, length, and date of last modification. Additionally, if the protocol used supports write operations, then methods implemented in this class can allow overwriting the content of a resource pointed to by a URL.
- InetAddress: This class supports Internet addresses, and is used when performing networking operations using sockets and datagrams. The InetAddress class does not have a public constructor method, but supports static factory methods to create new instances. Such instances can contain the address of the local host or the address of a host specified by name. The InetAddress class is used to initialize socket and datagram communications, which are explained as follows:
- Datagrams: UDP datagrams are fire-and-forget packets of information that are passed over the network. They provide fast communication. The tradeoff is that they are not guaranteed to reach their destination, and if they do, separate datagrams may not even arrive in the order they were sent. However, when optimal performance is required and the overhead of doing custom verification is justified, datagrams are a valuable mechanism to have available. Classes used for datagram communication are DatagramPacket (data container class) and DatagramSocket (datagram packet sender and receiver class).
- Sockets: TCP/IP sockets implement reliable, bi-directional, point-to-point, stream-based connections between hosts on the Internet. A common model for network communication is to have one or more clients sending requests to a single server program. In such cases the server uses an instance of the ServerSocket class to accept connections from clients. When a client reaches the port on which the server is listening, the server allocates a new Socket object dedicated to subsequent communication between server and client. After allocating the new connection, the server returns to the listen mode to receive additional client connections.
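The socket and datagram models described above can be sketched with the classes of the java.net package. The example below is a hedged illustration, not code from the system developed in this thesis: the class LoopbackDemo and its methods tcpEcho and udpSend are hypothetical names, and both exchanges run over the local loopback address so that no external host is needed. The sketch is written against a recent Java release (it uses try-with-resources and a lambda, which were not available in 1997).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: class and method names are hypothetical.
class LoopbackDemo {
    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // The second argument enables automatic flushing on println.
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // TCP: a server thread accepts one connection and echoes one line back to the client.
    static String tcpEcho(String message) {
        InetAddress host = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, host)) {  // port 0: any free port
            Thread serverThread = new Thread(() -> {
                try (Socket connection = server.accept()) {         // a new Socket per client
                    writer(connection).println("echo: " + reader(connection).readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();
            try (Socket client = new Socket(host, server.getLocalPort())) {
                writer(client).println(message);
                return reader(client).readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // UDP: one datagram is sent to a receiving socket; delivery is not guaranteed in general.
    static String udpSend(String message) {
        InetAddress host = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, host);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000);  // give up after two seconds if nothing arrives
            byte[] data = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(data, data.length, host, receiver.getLocalPort()));
            DatagramPacket packet = new DatagramPacket(new byte[512], 512);
            receiver.receive(packet);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tcpEcho("hello"));  // prints "echo: hello"
        System.out.println(udpSend("ping"));   // prints "ping"
    }
}
```

The TCP half needs a connection to be accepted before any data flows, whereas the UDP half simply addresses a packet to a host and port. The timeout on the receiving socket reflects the fact that a datagram may never arrive.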
3.3 LANGUAGE COMPARISON BETWEEN JAVA AND C++.
Java is a language that borrows much of its terminology and syntax from C++. However, Java is considered a simpler language than C++, since a number of C++ features have been removed from the Java implementation. In certain ways, this reduction allows programmers familiar with C++ to climb the learning curve easily. Java eliminates some C++ redundancies and non-object-oriented characteristics maintained as a legacy from C.
A number of main differences exist between Java and C++. These differences, which range from slight modifications to complete removal of features, are described as follows:
- No header files: Header files are considered of great benefit for data hiding, since they allow one to declare the prototypes for classes in a readable format while having the actual implementation in a binary file for distribution. On the other hand, header files create inconveniences such as the maintenance required to keep header file declarations consistent with the source file implementation. Java has eliminated header files, and it maintains all the information about a class inside the class implementation.
- No preprocessor: Java does not include any kind of preprocessor. One of the jobs of a preprocessor is to search for special commands that begin with a hash mark "#". These commands perform conditional compilation and macro replacement. It may seem hard for C++ developers to program without #define or #ifdef, but Java can make do without these constructs. In the case of #define, Java relies on the final keyword to achieve some of its functionality. Additionally, the import statement has similar characteristics to the #include command, and #ifdef commands can be partially simulated by using compilers that optimize blocks of code delimited by boolean expressions that have static values (e.g., if (false)).
- No global functions or global variables: In Java, methods and variables are declared within classes. Likewise, every class is part of a package, so that all methods and variables have fully qualified names. These names are formed using the package name, the class name and the variable or method name. Static variables and methods make it possible to simulate global functions and variables, but name conflicts cannot occur because of the naming convention just described.
- No goto statement: Java does not implement the goto statement and thus eliminates the main instrument of so-called "spaghetti code." However, the keywords break and continue cover some important and legitimate uses of goto in looping structures. Furthermore, Java's well-defined exception handling compensates for the absence of this statement.
- No operator overloading: Method overloading is a technique that allows the declaration of several methods with the same name but with different argument lists. Operator overloading is a similar technique, but it allows symbols to be declared as methods that perform operations according to the type of the parameters involved. Up to the present version, Java allows method overloading but does not allow operator overloading.
- No structures, unions, typedefs, bitfields, enumerated types or variable-length argument lists: Java does not support the struct and union types found in C++; however, structures can be simulated using classes without methods. Additionally, Java neither supports typedefs (to define new aliases for type names) nor bitfields (which can be used to interface hardware devices, for example). Java does not allow one to define methods that take a variable number of arguments. Method overloading and arrays can act as replacements for simple cases of variable-length argument lists. The absence of enumerated types is a missing feature that may be seen as unusual, since Java has the characteristic of being strongly typed. However, this circumstance may be the result of a design decision to maintain simplicity on the types handled by the language.
- No const parameter qualifier: In C++, when a parameter is specified with the const qualifier, the compiler makes sure that the value assigned to that variable will remain unaltered during its scope. As a result, methods receiving a variable passed as a const parameter are not allowed to make modifications to its value. In Java, there is no automatic mechanism to perform this operation.
- No templates: C++ templates are type-parameterized classes or functions. Template-based class libraries are not just type safe but also enhance the reuse of structures for different types. Templates are also relevant in the context of containers. Since object-based (non-template) containers have no mechanism to enforce a particular object type for their elements, there is no way to ensure that a container actually holds objects of the expected type.
- Characters are Unicode characters: In Java, values of type char are not signed. Additionally, characters and strings are composed of 16-bit Unicode characters, allowing easy internationalization of programs that do not use the Latin alphabet. The Unicode character set is composed of more than 34,000 distinct characters, where the first 128 match ASCII and the first 256 match ISO 8859-1 (Latin-1).
- Arrays and Strings are objects: Arrays and strings behave just as regular objects: they are manipulated by reference, they can be dynamically created with new, and they are automatically garbage collected when no longer needed. However, they are special in the sense that they can be manipulated in ways that other objects cannot. As shown in Figure 7, arrays and strings can be directly initialized by specifying their value. In the case of strings, concatenation can be achieved by placing addition symbols between string variables and constants.
Figure 7. Code example on initialization of strings and arrays.
String subject = "John Doe";
String[] aliases = {"Steven Sagan", "Vitto Corleone"};
String message = subject + " is also known as " + aliases[0] + " and " + aliases[1];
- null is a reserved keyword and boolean is a primitive data type: In Java, null is a value that indicates an absence of reference. Unlike C++, where NULL is just a constant defined to be 0, null in Java is a reserved word that has no numeric value and cannot be assigned to primitive data types. On the other hand, boolean is defined as a primitive data type that can be assigned a value of true or false. In contrast to C++, boolean values are not integers; they cannot be treated as integers, and may never be cast to or from any other type.
- Primitive data types are fixed in size and sign, and cannot be cast to objects, or vice versa: As previously seen in Table 3, boolean, char, byte, short, int, long, float and double are the primitive data types available in Java. These types are always fixed in sign and size, unlike C++ where an integer may be 16, 32 or 64 bits, and characters may be signed or unsigned depending on the platform of execution. Additionally, Java does not allow conversions between primitive data types and object references, as C++ does (e.g., casting an integer to a pointer).
- Parameter-passing is always by value: There are two techniques to pass parameters to C++ functions: call by value and call by reference. When a variable is passed using "call by value", a copy of the original data is passed, so the copy can be modified without altering the value of the original variable. When a variable is passed using "call by reference", an alias is created for the variable itself. This alias represents the memory address where the variable is located, so modifications made inside the function also modify the original value. In Java, variables are always passed to methods by value. For primitive data types, this means that an independent copy of the original value is passed. For handles to objects, a copy of the handle is passed. This allows the modification of the object referenced by the original handle, but does not allow the modification of the handle itself (this behavior is achieved in C++ by using the const modifier on pointers and references passed to functions). Java has no mechanism to modify the original value of an argument from within a method, whether it is a handle or a primitive data type. One way to update a variable submitted to a method as an argument is to assign the return value of the method to that variable upon return.
- Threads and synchronization are part of the core language: As previously explained in the Multi-threaded section earlier in this chapter, synchronization in Java is an intrinsic part of the language. Synchronization is achieved by the use and enforcement of locks, which prevent multiple threads from simultaneously accessing critical sections of code. The Thread class encapsulates all the information about a single thread of control running on the Java interpreter. This type of support for threads and their synchronization is not implemented as part of the C++ language.
- Automatic memory management: Objects in Java are created on the heap using the new keyword. However, there is no delete keyword to dispose of them, as in C++. This is because Java implements a memory manager that handles all references to the heap and disposes of objects that are no longer referenced or used in a program. The disposal of objects and the freeing of memory are performed by a process called garbage collection. The garbage collector runs on a low-priority thread whenever the system is idle, or when a request for memory allocation fails to find enough free memory. The concept of automatic memory management is foreign to C++, where programmers have the responsibility to remember when and where to dispose of allocated objects. It is worth mentioning that garbage collection is generally not as efficient as explicit, well-written memory allocation and deallocation routines written by programmers. However, it does make programming easier and less prone to errors.
- Single inheritance on classes, multiple inheritance with interfaces: C++ allows classes to have more than one superclass, using a technique known as multiple inheritance. This technique allows class designers to mix various attributes from different branches of a class hierarchy. Java does not implement multiple class inheritance, but implements multiple interface inheritance. Interfaces are just like classes, except that they are allowed neither to declare instance variables nor to implement methods.
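Several of the differences listed above can be observed in a short program. The following is a sketch for illustration only: the class CppDifferencesDemo and all its members are hypothetical names. It shows a final constant standing in for a #define, method overloading, parameter passing by value for both primitives and handles, and a labeled break, which is the form of break that Java provides for leaving nested loops without goto.

```java
// Illustrative sketch only: class and member names are hypothetical.
class CppDifferencesDemo {
    static final int MAX_NODES = 3;  // plays the role of a C++ #define constant

    // Method overloading: same name, different argument lists.
    static String describe(int value)    { return "int " + value; }
    static String describe(String value) { return "String " + value; }

    // The parameter is a copy of the caller's primitive; the caller's variable is untouched.
    static void increment(int n) { n = n + 1; }

    // The handle is copied too: the object it refers to can be modified...
    static void fill(int[] values) { values[0] = 42; }

    // ...but reassigning the parameter has no effect on the caller's handle.
    static void replace(int[] values) { values = new int[] { 7 }; }

    // A labeled break leaves both loops at once, a common legitimate use of goto.
    static int firstRowWithZero(int[][] grid) {
        int found = -1;
        search:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == 0) {
                    found = r;
                    break search;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int n = 1;
        increment(n);
        System.out.println(n);                 // prints 1

        int[] nodes = new int[MAX_NODES];
        fill(nodes);
        replace(nodes);
        System.out.println(nodes[0]);          // prints 42
        System.out.println(nodes.length);      // prints 3

        System.out.println(describe(5));       // prints "int 5"
        System.out.println(describe("five"));  // prints "String five"
        System.out.println(firstRowWithZero(new int[][] { { 1, 2 }, { 3, 0 } }));  // prints 1
    }
}
```

The call to replace() leaves nodes unchanged because only the copy of the handle is reassigned, whereas fill() reaches the original array through that copy.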
3.4 CHAPTER SUMMARY.
This chapter covered the fundamental aspects of the Java language, which was described as an object-oriented, distributed, portable, secure and multi-threaded programming language. After discussing these characteristics, Java was scrutinized to find its suitability as a programming language for the Web.
Additionally, Java networking classes were explained. Classes that handle Internet addresses and Uniform Resource Locators were introduced as the basis to support datagram and socket communications, and URL connections.
This chapter concluded with a section detailing the main differences between Java and C++. In its role as the most popular programming language, C++ is compared with Java with respect to aspects ranging from the structure of primitive data types to templates and automatic memory management.
Chapter 4 is an overview of the Java concept mapping tool implemented as a test case for this research. That chapter presents previous concept mapping tool developments, as well as the motivation underlying such developments. The system architecture for the test case, which has been named jKSImapper, is further discussed in Chapter 5. That chapter also discusses lessons learned when porting previously developed C++ classes to Java.