SYSC 5800: Network Computing

Sample Solution to Final Exam, Winter 2006

 

Question 1. Sockets (10 marks)

1.        What is the difference between a streaming socket and a datagram socket?

 

Answer (5 marks):

Stream sockets treat communication as a continuous stream of bytes, while datagram sockets send and receive whole messages, so each read returns one entire message. Each uses its own transport protocol: stream sockets use TCP (Transmission Control Protocol), which is reliable and stream-oriented, and datagram sockets use UDP (User Datagram Protocol), which is unreliable and message-oriented. A stream socket requires the establishment of a virtual connection, with both endpoints keeping state information about the connection. TCP packets also have a slightly bigger transport-layer header (at least 20 bytes). A datagram socket incurs no connection establishment/termination overhead, and its per-packet transport-layer overhead is smaller (8-byte UDP header).
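The difference in message boundaries can be sketched with Python's standard socket module. This is a minimal illustration over the loopback interface, not part of the original answer; the function names udp_demo and tcp_demo are made up for this example.

```python
import socket

def udp_demo(messages):
    """Send each message as its own datagram and read the datagrams back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for message in messages:         # no connection setup is needed
            sender.sendto(message, receiver.getsockname())
        # Each recvfrom() call returns exactly one whole datagram.
        return [receiver.recvfrom(4096)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()

def tcp_demo(messages):
    """Send the messages over one TCP connection and read until end of stream."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())   # connection establishment
    conn, _ = listener.accept()
    try:
        for message in messages:
            client.sendall(message)
        client.shutdown(socket.SHUT_WR)       # signal end of stream
        chunks = []
        while True:
            data = conn.recv(4096)            # may return any part of the stream
            if not data:
                break
            chunks.append(data)
        return chunks
    finally:
        conn.close()
        client.close()
        listener.close()
```

With UDP, the receiver gets one entry per message sent. With TCP, only the concatenated byte stream is guaranteed; how the bytes are split across recv() calls is up to the implementation, so an application that needs message boundaries has to add its own framing.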

 

2.        Which type of sockets would you choose for the following applications (assume that the corresponding server/partner in the communication would choose the same socket type)? Explain your choice.

a.        FTP client

b.       VoIP client

c.        DNS lookup

d.       Videoconference application:

                                                   i.      The control channel

                                                  ii.      The data (i.e., video and voice) channels

 

Answer (5 marks):

·         FTP client: use TCP (streaming socket) to get reliable, in-order data delivery. The goal is to transfer files, and even a single undetected error or reordered packet would corrupt the transferred file.

·         VoIP client: in-order delivery is nice, but not essential. Similarly, reliable transmission of data sounds advantageous at first, as this would increase voice quality. But TCP achieves reliability by asking the sender to retransmit lost packets. For VoIP, late packets are as unacceptable as lost packets. So to keep the network overhead lower, choose a datagram socket.

·         DNS lookup: this is in essence a data application, so reliable, in-order delivery would be nice. But DNS lookups are small and therefore typically fit into a single packet. The client expects a reply from the server, so lost packets can be dealt with by retransmitting the query at the application layer. In addition, DNS queries are typically sent to a DNS server in the local network, so the chances of losing a packet are small. To reduce the overhead and avoid setting up and tearing down connections repeatedly, choose a datagram socket.

·         Videoconference application: choose a streaming socket for the control channel, which is long-lived and requires reliable transmission. Choose datagram sockets for the streaming data channels (video and voice) for the same reasons as discussed under VoIP.
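The application-layer retransmission mentioned for DNS lookups can be sketched as follows. This is a toy request/reply exchange over UDP, not real DNS: the names query_with_retry, lossy_echo_server and demo, the "reply:" prefix, and the timeout and retry values are all assumptions made for illustration.

```python
import socket
import threading

def query_with_retry(server_addr, payload, retries=3, timeout=0.5):
    """Send a datagram and wait for a reply, resending on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        for attempt in range(1, retries + 1):
            sock.sendto(payload, server_addr)
            try:
                reply, _ = sock.recvfrom(512)
                return reply, attempt
            except socket.timeout:
                continue                 # query or reply was lost: resend
        raise TimeoutError(f"no reply after {retries} attempts")
    finally:
        sock.close()

def lossy_echo_server(sock, drop_first=1):
    """Toy server: ignores the first drop_first requests, then answers one."""
    dropped = 0
    while True:
        data, addr = sock.recvfrom(512)
        if dropped < drop_first:
            dropped += 1                 # simulate a lost packet
            continue
        sock.sendto(b"reply:" + data, addr)
        return

def demo():
    """Run one query against a server that drops the first request."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_sock.bind(("127.0.0.1", 0))
    worker = threading.Thread(target=lossy_echo_server,
                              args=(server_sock, 1), daemon=True)
    worker.start()
    try:
        return query_with_retry(server_sock.getsockname(), b"example.org")
    finally:
        worker.join(2.0)
        server_sock.close()
```

The client never sets up a connection; reliability comes entirely from the timeout-and-resend loop in the application, which is cheap here because the whole exchange fits into one packet in each direction.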

 


Question 2. RMI/Corba (10 marks)

1.        Describe how RMI and Corba are similar to and different from RPC.

 

Answer (5 marks):

Similar:

·         higher-level programming abstraction, no need to deal with sockets directly

·         support by tools to generate some of the networking-related code automatically, driven by some sort of Interface Definition file (well, in most cases, not true for XML-RPC and dynamic invocation in Corba)

·         based on a request-response style communication pattern: invoke a remote component, passing parameters, and then wait for a reply containing the result(s) of the invocation

 

Differences:

·         RPC based on procedural programming language concepts (procedure), RMI/Corba based on object-oriented programming concepts (objects/interfaces)
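The shared request-response pattern can be sketched with XML-RPC, which is mentioned above and ships with Python's standard library. This is only an illustration of the RPC style; the procedure add and the helper demo are hypothetical names, and nothing here is RMI or Corba code.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    """The remote procedure: a plain function, not a method of an object."""
    return a + b

def demo():
    """Start a local XML-RPC server, invoke add() remotely, return the result."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    host, port = server.server_address
    try:
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            # Looks like a local call; the proxy marshals the parameters,
            # sends the request, and blocks until the reply arrives.
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()
```

The caller never touches a socket. Note that the unit of invocation is a procedure name; in RMI or Corba the caller would instead hold a reference to a remote object and invoke a method defined by its interface.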

 

2.        What are the differences between RMI and Corba?

 

Answer (5 marks):

·         RMI: single programming language (Java), Corba designed explicitly for multi-platform and multi-language environments. For example, the RMI interface definition is based on Java’s interface syntax, while Corba provides its own, independent IDL to specify interfaces, which then need to be mapped to specific implementation languages

·         RMI: invoked component is by design a Java object (instance of a Java class that implements an RMI interface), in Corba the IDL could hide/abstract many different ways to implement the invokee (could be a legacy application), with the instantiation etc. managed by the Object Adapter

·         RMI provides distributed garbage collection, Corba does not

·         RMI focuses on basic communication mechanism, Corba provides basic communication mechanism plus lots of standardized services to make application development easier

·         Both use interface definitions to automatically generate code for static remote method invocations, but Corba also allows dynamic invocations (defer discovery of remote objects and their interfaces until runtime).

·         Slight differences in the capabilities of the name server: in RMI, objects have to register with a local nameserver, using a flat name space, while Corba provides a system-wide nameserver with a hierarchical namespace.

 


Question 3. XML (10 marks)

1.        Why did the world need another markup language (other than HTML) and what are some of the key features of XML?

 

Answer (4 marks):

HTML is a very presentation-oriented markup language. As such it is good for preparing documents that are to be displayed to a human reader on a PC-style device. There have been other attempts at defining presentation-oriented markup languages, such as HDML for smartphones (cellphones with a microbrowser), which take into consideration the unique device limitations of such devices.

 

Ultimately, however, a different sort of markup language is required to allow documents to be exchanged, interpreted, etc. by software components communicating with each other (for example, when using XML-RPC). Rather than marking up the content with presentation attributes, a markup language that provides information about the content is needed. In the example used in class, it is one thing to declare that book titles should be displayed in italic fonts. It is a completely separate issue to actually define that the text string “blabla” is a book title.

 

XML does not attempt to define markup tags for every imaginable content (a rather hopeless enterprise). Instead, XML defines general syntax rules (which ensure that an XML document is well-formed: single root element, proper nesting of elements). In addition, the validity of an XML document is defined by a separate document, either a DTD or an XML Schema. In these documents, the allowed elements/tags, their sequencing, hierarchical relationship, and possible attributes are specified.
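The well-formedness rules can be checked with any XML parser. The following sketch uses Python's xml.etree.ElementTree; the sample documents and the helper is_well_formed are made up for illustration. ElementTree checks well-formedness only, so validity against a DTD or XML Schema is outside the scope of this example.

```python
import xml.etree.ElementTree as ET

GOOD = "<book><title>blabla</title></book>"          # content-oriented markup
BAD_NESTING = "<book><title>blabla</book></title>"   # elements overlap
TWO_ROOTS = "<book/><book/>"                         # more than one root

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def book_title(text):
    """Software can find the title because the markup says what it is."""
    return ET.fromstring(text).findtext("title")
```

Here is_well_formed(GOOD) holds, while the other two documents are rejected by the parser, and book_title(GOOD) retrieves the string that the markup identifies as a title.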

 

2.        What are the relative advantages and disadvantages of using SAX and DOM for parsing XML documents?

 

Answer (6 marks):

SAX:

·         event-driven

·         serial-access

·         element-by-element processing

DOM:

·         creates a tree structure of objects

·         stores it in memory

·         easier to navigate, but more memory needed
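The two parsing styles can be compared on the same task with Python's standard xml.sax and xml.dom.minidom modules. This is a sketch only; the sample document DOC and the names TitleHandler, titles_sax and titles_dom are assumptions made for this example.

```python
import xml.sax
from xml.dom import minidom

DOC = ("<library><book><title>A</title></book>"
       "<book><title>B</title></book></library>")

class TitleHandler(xml.sax.ContentHandler):
    """SAX: the parser calls back per element; we keep only what we need."""
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        if self._in_title:
            self._buffer.append(content)   # text may arrive in several pieces

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

def titles_sax(text):
    """Collect titles in one serial pass without building a tree."""
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles

def titles_dom(text):
    """DOM: build the whole tree in memory, then navigate it freely."""
    document = minidom.parseString(text)
    return [node.firstChild.data
            for node in document.getElementsByTagName("title")]
```

The DOM version is shorter and allows arbitrary navigation after parsing, at the cost of holding the whole document in memory. The SAX version needs explicit state tracking in the handler but only stores the extracted titles.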

 

 


Question 4. Web Services (10 marks)

What are the main standards that enable Web Services? What is their role/how do they interact?

 

Answer (10 marks):

 

·         XML (Extensible Markup Language) - For data representations

·         SOAP [a.k.a. Simple Object Access Protocol, XML Protocol] - An XML packaging protocol, defines message formats and encoding rules

·         WSDL (Web Services Description Language) - An XML service description language

·         UDDI (Universal Description, Discovery, and Integration) - For discovering service providers if you do not know their URLs
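How SOAP packages an XML payload can be sketched by building a SOAP 1.1 envelope by hand with Python's xml.etree.ElementTree. The service namespace http://example.org/stock, the GetQuote operation and its symbol parameter are hypothetical; in a real deployment they would come from the service's WSDL description, and a SOAP toolkit would normally generate the message.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
APP_NS = "http://example.org/stock"                      # hypothetical service

def build_request(symbol):
    """Wrap an application-level request in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(operation, f"{{{APP_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")

def read_symbol(message):
    """Receiver side: unwrap the envelope and extract the parameter."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetQuote/{{{APP_NS}}}symbol"
    return root.findtext(path)
```

The envelope and body elements belong to SOAP, while everything inside the body belongs to the application. The message itself is plain XML, which is why XML sits at the bottom of the list above.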

 


Question 5. P2P Computing (10 marks)

Explain what a distributed hash table is and how such distributed hash tables form the foundation of modern P2P systems.

 

Answer (10 marks):

Distributed hash table:

·         Distributed version of a hash table data structure

·         Stores (key, value) pairs: the key is like a filename, the value can be the file contents

·         Goal: efficiently insert/lookup/delete (key, value) pairs

·         Each peer stores a subset of the (key, value) pairs in the system

·         Core operation: find the node responsible for a key, i.e., map the key to a node and efficiently route the insert/lookup/delete request to this node

 

Desirable properties:

·         Keys mapped evenly to all nodes in the network

·         Each node maintains information about only a few other nodes

·         Efficient routing of messages to nodes

·         Node insertion/deletion only affects a few nodes
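The key-to-node mapping can be sketched with consistent hashing on an identifier ring, the scheme used by Chord-style DHTs. This sketch makes one big simplification: the Ring class knows all nodes, whereas in a real DHT each node knows only a few others and the lookup is routed hop by hop. The names ring_id and Ring, the 32-bit identifier space and the peer names are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left

def ring_id(name, bits=32):
    """Hash a node name or a key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

class Ring:
    """Global view of the ring, used here only to show the mapping."""
    def __init__(self, nodes):
        self._points = sorted((ring_id(node), node) for node in nodes)

    def responsible_node(self, key):
        """A key belongs to the first node at or after its position."""
        ids = [point for point, _ in self._points]
        index = bisect_left(ids, ring_id(key)) % len(ids)   # wrap around
        return self._points[index][1]

def moved_keys(nodes, new_node, keys):
    """Return the keys whose responsible node changes when new_node joins."""
    before, after = Ring(nodes), Ring(nodes + [new_node])
    return [key for key in keys
            if before.responsible_node(key) != after.responsible_node(key)]
```

Because both nodes and keys are hashed onto the same ring, a joining node takes over only the keys that fall between it and its predecessor. Every key returned by moved_keys now belongs to the new node and all other keys stay where they were, which is the "only affects a few nodes" property listed above.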

 

Use in P2P systems: ultimately, P2P systems should avoid the use of any centralized entity (server, directory, etc.). If content is fully distributed, the system needs efficient operations to look up/find specific entities, and distributed hash tables are the core technology to do this. Without them, P2P systems would have to resort to flooding, which does not scale well (but may work reasonably well for smaller systems).