Real-Time Communication Scheduling in a Multicomputer Video Server

Journal of Parallel and Distributed Computing 58, (1999) Article ID jpdc , available online at on Real-Time Communication Scheduling in a Multicomputer Video Server
A. L. Narasimha Reddy
Department of Electrical Engineering, Texas A&M University, 214 Zachry, College Station, Texas

and

Eli Upfal
Department of Computer Science, Brown University, P.O. Box 1910, Providence, Rhode Island

Received March 17, 1997; revised October 10, 1997; accepted April 27, 1999

In this paper, we address the problem of scheduling communication over the interconnection network of a distributed-memory multicomputer video server. We show that this problem is closely related to the problems of data distribution and movie scheduling in such a system. A solution is proposed in this paper that addresses these three issues at once. The movies are distributed evenly over all nodes of the multicomputer. The proposed solution minimizes the contention for links over the switch. The proposed solution makes movie scheduling very simple: if the first block of the movie is scheduled, the rest of the movie is automatically scheduled. Moreover, if the first block of the movie stream is scheduled without network contention, the proposed solution guarantees that there will be no network contention during the entire duration of playback of that movie. We show that the proposed approach to communication scheduling is optimal in utilizing the network resources. Extensive simulation results are presented to show the effectiveness of the proposed approach.

Academic Press

1. INTRODUCTION

Several telephone companies and cable operators are planning to install large video servers that would serve video streams to customers over telephone lines or cable lines. These projects envision supporting several thousands of customers with the help of one or several large video servers.
These projects aim to store movies in a compressed digital format and route the compressed movie to the home, where it can be uncompressed and displayed. These projects aim to compete with the local video rental stores with better service: offering the ability to watch any movie at any time (avoiding the situation of all the copies of the desired movie being rented out already) and offering a wider selection of movies.

Copyright 1999 by Academic Press. All rights of reproduction in any form reserved.

Providing a wide selection of movies requires that a large number of movies be available in digital form. Currently, with MPEG-1 compression, a movie of roughly 90 minutes' duration takes about 1 GB of storage. A video server storing about 1000 movies (a typical video rental store carries more) would then have to spend about $250,000 just for storing the movies on disk at a cost of $0.25/MB. This requirement of large amounts of storage implies that the service providers need to centralize the resources and provide service to a large number of customers to amortize costs. Hence the need to build large video servers that store a large number of movies in a single system and can serve a large number of customers.

Multicomputer systems may be suitable candidates for supporting the large amounts of real-time I/O bandwidth required in these large video servers. Several problems need to be addressed to provide the required real-time I/O bandwidth in such a multicomputer system. In this paper, we outline some of the problems, and their solutions, particular to a video server based on multicomputers. We will use the term multicomputer system to describe a system that may be variously known as a multicomputer or a clustered system without a single address space.

2. THE PROBLEM

We will assume that the multicomputer video server is organized as shown in Fig. 1. A number of nodes act as storage nodes.
Storage nodes are responsible for storing video data either in memory, disk, tape, or some other medium and delivering the required I/O bandwidth to this data. The system also has network nodes. These network nodes are responsible for requesting appropriate data blocks from storage nodes and routing them to the customers. Both these functions can reside on the same multicomputer node, i.e., a node can be a storage node, a network node, or both at the same time. Each request stream would originate at one of the several network nodes in the system, and this network node would be responsible for obtaining the required data for this stream from the various storage nodes in the system.

To obtain high I/O bandwidth, data has to be striped across a number of nodes. If a movie is completely stored on a single disk, the number of streams requesting that movie will be limited by the disk bandwidth. As shown earlier [1], a 3.5-inch 2-GB IBM disk can support up to 20 streams. A popular movie may receive more than 20 requests over the length of the playback time of that movie. To enable serving a larger number of streams of a single movie, each movie has to be striped across a number of nodes. As we increase the number of nodes for striping, we increase the bandwidth for a single movie. If all the movies are striped across all the nodes, we also improve the load balancing across the system, since every node in the system has to participate in providing access to each movie. Hence, we assume that all the movies are striped across all the nodes in the system. Even when the movies are stored in semiconductor memory, the required communication and memory bandwidths may require that the movie be striped across the memories of different nodes of the system. For the rest of the paper, we will assume that the movies are striped across all the storage nodes.

FIG. 1. System model of a multicomputer video server.
The unit of striping across the storage nodes is called a block. In our earlier studies on disk scheduling [1], we found that Kbytes is a suitable disk block size for delivering high real-time bandwidth from the disk subsystem.

As a result of this striping, a network node that is responsible for delivering a movie stream to the user may have to communicate with all storage nodes in the system during the playback of that movie. This results in point-to-point communication from all storage nodes to the network node (possibly multiple times, depending on the striping block size, the number of nodes in the system, and the length of the movie) during the playback of the movie. Each network node will be responsible for a number of movie streams. Hence, the resulting communication pattern is random point-to-point communication among the nodes of the system. It is possible to achieve some locality by striping the movies among a small set of nodes and requiring that the network nodes for a movie be among this smaller set of storage nodes.

We observe that the delivery of video data to the consumer requires three components of service: (1) reading of the requested block from the disk to a buffer at the storage node, (2) transmission of the block from the storage node to the network node over the multicomputer network, and (3) transmission of the block from the network node to the consumer's desktop. Since the last component of service would depend on the delivery medium (telephone wires, cable, or LAN), we will not address that issue here and will limit our attention to the service that has to be provided by the video server system, components (1) and (2).

A video server has to supply data blocks of the movie at regular periods to the consumer. If data is not transferred at regular intervals, the consumer may experience glitches in the delivery of the movie.
To ensure glitch-free service, the video server has to guarantee finishing the three components of service in a fixed amount of time. Guaranteeing delay bounds in service component (1) is addressed by appropriate disk scheduling [14]. The problem of ensuring delay bounds in service component (2) is addressed in this paper. When multiple transmissions take place simultaneously over the network, the delay experienced by an individual transmission depends on the contention in the network. Worst-case assumptions about contention are typically made to obtain guaranteeable delay bounds in a network. Our approach is to carefully schedule individual transmissions over the network so as to minimize (or eliminate) contention. This approach enables us to guarantee tighter delay bounds on transmissions over the multicomputer network of the video server.

For the rest of the paper, we will assume that every node in the system is both a storage node and a network node at the same time, i.e., a combination node. We will use a multicomputer system with an Omega interconnection network as an example multicomputer system.

Movie distribution (or data distribution/organization) is the problem of distributing the blocks of movies across the storage nodes. This involves the order in which the blocks are striped across the storage nodes. Data organization determines the bandwidth available to a movie, the load balance across the storage nodes, and the communication patterns observed in the network.

Movie scheduling is the problem of scheduling a storage node and a network node such that the required blocks of a movie stream arrive at the network node in time. At any given point in time, a node can be involved in sending one block of data and receiving one block of data. Movie scheduling is concerned with scheduling the transfer of blocks of a movie between storage nodes and the network node for that movie stream.
Communication scheduling is a direct consequence of the movie scheduling problem. When two transfers are scheduled to take place between two different source and destination pairs, the two transfers may not be able to proceed simultaneously because of contention in the network. Figure 2 shows a 16-node Omega network [5] built out of 4×4 switches. Figure 2 assumes unidirectional links going from left to right; hence, each node has a sending port on one side and a receiving port on the other side in Fig. 2. We will assume that each node has a sending port and a receiving port and that a node can participate in a send and a receive operation simultaneously. Communication cannot take place simultaneously between nodes 1 and 3 and between nodes 9 and 2 in Fig. 2. Can movies be scheduled such that there is no contention at the source, at the destination, and in the network? The communication scheduling problem deals with this issue of scheduling the network resources to minimize the communication delays. This problem of scheduling communication over the multicomputer interconnection network is the focus of this paper.

FIG. 2. A 16-node Omega network.

If the nodes in the multicomputer system are interconnected by a complete crossbar network, there is no communication scheduling problem, since any pair of nodes in the system can communicate without a conflict in the network. The disk scheduling problem is dealt with at each node separately, and we will assume that the system load is such that disk bandwidth is not a problem.

Deadline scheduling [6] is known to be an optimal scheduling strategy when the tasks are preemptable with zero cost and the task completion times are known in advance. Deadline scheduling is shown to be optimal even when the tasks are not preemptable [7]. However, both these studies assume that the task completion times are known in advance.
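The conflict between the transfers 1 to 3 and 9 to 2 in the Omega network example can be reproduced with a small model. The sketch below is not taken from the paper. It reads each pair as (source, destination) and assumes a 4-ary perfect shuffle in front of each of the two switch stages with destination-tag routing (most significant base-4 digit first), which is one common way to wire an Omega network; the wiring in Fig. 2 may differ in detail. The function names are hypothetical.

```python
def to_digits(x, radix, stages):
    """Base-`radix` digits of x, most significant first."""
    return [(x // radix ** i) % radix for i in reversed(range(stages))]


def path_links(src, dst, radix=4, stages=2):
    """Links used by a transfer src -> dst (assumed wiring, see lead-in).

    Each link is identified by (stage, position), where position is the
    digit tuple of the switch output line the message occupies after
    that stage. The last-stage link is the destination's receiving port.
    """
    pos = to_digits(src, radix, stages)
    tag = to_digits(dst, radix, stages)
    links = []
    for stage in range(stages):
        pos = pos[1:] + pos[:1]   # perfect shuffle: rotate digits left
        pos[-1] = tag[stage]      # switch selects the output port
        links.append((stage, tuple(pos)))
    return links


def conflicts(a, b, radix=4, stages=2):
    """True if transfers a and b (each a (src, dst) pair) share a link."""
    return bool(set(path_links(*a, radix, stages)) &
                set(path_links(*b, radix, stages)))


# Under the assumed wiring, both transfers need the same link between
# the first and second switch stage, so they cannot run in one slot.
print(conflicts((1, 3), (9, 2)))
```

In this model two transfers that share a destination also collide, because the last-stage link is the receiving port. Contention at a shared source port is not modelled and would need a separate check.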
The block transfer time in the network, however, depends on whether there is any contention for the switch links, and this contention varies with the network load. Also, a network transfer requires multiple resources (links, input ports, and output ports), unlike the single resource assumed in the deadline-scheduling studies [6, 7]. Hence, the deadline-scheduling results cannot be directly applied to our problem.

Recent work [1, 2, 3, 4, 8] has looked at disk scheduling in a video server. File systems for handling continuous media have been proposed [9-13]. Multicomputer-based video servers are studied in [14-16]. Related work in multicomputer communication includes estimation of delays in the network [17-19], studies on network hotspots [20], and many application-specific algorithms that exploit the network topology, for example, [21]. Other approaches to supporting multimedia network traffic include best-effort approaches [22, 23].

3. SCHEDULING PROCESS

3.1. Basic Approach

We will assume that time is divided into a number of slots. The length of a slot is roughly equal to the time taken to transfer a block of a movie over the multicomputer network from a storage node to a network node (we will say more later on how to choose the size of a slot). Each storage node starts transferring a block to a network node at the beginning of a slot, and this transfer is expected to finish by the end of the slot. It is not necessary for the transfer to finish strictly within the slot but, for ease of presentation, we will assume that a block transfer completes within a slot.

The time taken for the playback of a block is called a frame. The length of the frame depends on the block size and the stream rate. For a block size of 256 Kbytes and a stream rate of 200 Kbytes/s, the length of a frame equals 256/200 = 1.28 s. In this paper, we consider constant bit rate (CBR) streams that require a constant data rate.
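The frame length is just the block size divided by the stream rate. As a minimal sketch (the function name is hypothetical, and units are assumed to be Kbytes and Kbytes/s as in the text):

```python
def frame_length_s(block_kbytes, rate_kbytes_per_s):
    """Playback time of one block, in seconds, for a CBR stream."""
    if rate_kbytes_per_s <= 0:
        raise ValueError("stream rate must be positive")
    return block_kbytes / rate_kbytes_per_s


# The example from the text: 256-Kbyte blocks at 200 Kbytes/s.
frame = frame_length_s(256, 200)
```

Doubling the block size at a fixed stream rate doubles the frame length, so each stream needs network service half as often but for a transfer that is twice as large.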
A block takes a slot for transfer from storage node to network node and takes a frame for playback at the client. Hence, a stream requires data transfer service across the multicomputer network for one slot in every frame. If we can guarantee that a block of data is provided to the client every frame, then, with double buffering at the client, the client can play back the movie continuously. We will assume that the frame is an integral multiple (F) of the slot size. The slot size can be increased suitably to make this possible.

Now consider scheduling the communication required by a single request stream. We will assume that a single network node will coordinate the delivery of this request stream to the client. Let us assume that the data required by this stream k in frame j is stored in storage node s_k^j. Then the data transfer required by this stream can be represented by a sequence of pairs of the form (n_k, s_k^0), (n_k, s_k^1), and so on, where each pair represents the (destination, source) pair involved in the communication in each frame and n_k is the network node involved in the delivery of this stream. For a single stream, the resulting traffic pattern is many (storage nodes) to one (network node). Each stream requires similar network service from the system. Hence, the network service for all the streams has a many-to-many traffic pattern. Data distribution among the storage nodes determines the exact traffic pattern. We will assume in this paper that a node can simultaneously participate in a send and a receive operation.

The schedule in the system can then be represented by a table as shown in Fig. 3. Each row in the table represents a network node, each column represents a time slot, and each entry in the table represents the individual stream getting service and the storage node transmitting the required block to the network node n_k. Each stream's entries are separated by a frame (or F slots).
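The slot sizing and the schedule table can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the minimum transfer time of 0.1 s, the function names, and the example streams are all hypothetical. Slots are stretched so that a frame holds a whole number F of them, each stream gets one entry every F slots, and a slot is flagged if a storage node sends twice or a network node receives twice in it.

```python
import math
from collections import defaultdict


def fit_slot(frame_s, min_slot_s):
    """Stretch the slot so that a frame is exactly F slots long."""
    F = math.floor(frame_s / min_slot_s)
    if F < 1:
        raise ValueError("a block transfer does not fit in a frame")
    return F, frame_s / F


def build_schedule(streams, F, frames):
    """Build the table: slot -> list of (network node, storage node, stream).

    `streams` maps a stream id to (network node, first slot, list of
    storage nodes, one per frame).
    """
    table = defaultdict(list)
    for k, (n_k, first_slot, sources) in streams.items():
        for j in range(frames):
            table[first_slot + j * F].append((n_k, sources[j], k))
    return table


def node_conflicts(table):
    """Slots in which some node sends twice or receives twice."""
    bad = []
    for slot, entries in sorted(table.items()):
        dests = [n for n, _, _ in entries]
        srcs = [s for _, s, _ in entries]
        if len(set(dests)) < len(dests) or len(set(srcs)) < len(srcs):
            bad.append(slot)
    return bad


F, slot_s = fit_slot(1.28, 0.1)  # hypothetical 0.1 s minimum transfer time

# Hypothetical streams: (network node, first slot, storage node per frame).
streams = {
    "a": (0, 0, [1, 2, 3]),
    "b": (1, 0, [2, 3, 0]),
    "c": (2, 1, [1, 3, 3]),
}
clean = node_conflicts(build_schedule(streams, F=4, frames=3))

# Stream "d" asks storage node 1 for a block in slot 0, as stream "a" does.
streams_with_d = dict(streams, d=(3, 0, [1, 0, 2]))
clash = node_conflicts(build_schedule(streams_with_d, F=4, frames=3))
```

This check covers only conflicts at the nodes; whether the pairs in a column can also be routed without sharing a network link is a separate question that depends on the interconnection network.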
If a stream has an entry in time slot j requiring service from storage node s_i and network node n_k, then that stream will have an entry in time slot j+F requiring service from storage node s_(i+1) and network node n_k, where s_(i+1) is the storage node storing the next block of data for this stream. The order of storage nodes s_i, s_(i+1) is determined by the data distribution. To avoid scheduling conflicts at the nodes, a storage node cannot be scheduled twice in the same column of this table, since that represents two transmissions in the same time slot. Conflicts in the network can be avoided if the pairs of nodes involved in communication in a slot (the entries in a column of the table) can be connected by edge-disjoint paths in the network.

Now, the problem can be broken up into two pieces:
(a) Can we find a data distribution that, given an assignment of (n_k, s_k^j) that is source and destination conflict-free, can produce a source and destination conflict-free schedule in slot j+F (service required in the next frame)?
(b) Can we find a data distribution that, given an assignment of (n_k, s_k^j) that is source, destination, and network conflict-free, can produce a source, destination, and network conflict-free schedule in slot j+F?

The second part of the problem, (b), depends on the network of the multicomputer, and that is the only reason for addressing the problem in two stages. We will propose a general solution that addresses (a). We then tailor this solution to suit the multicomputer network to address problem (b).

FIG. 3. Characteristics of a schedule table.

3.2. Movie Scheduling and Data Distribution

Assume all the movies are striped among the storage nodes starting at node 0 in the same pattern; i.e., block i of each movie is stored on the storage node given by i mod N, N being the number of nodes in the system. Then, a movie stream accesses storage nodes in a fixed sequence once it is started at node 0.
If we can start the movie stream, it implies that the source and the destination do not collide in that time slot. Since all the streams follow the same sequence of source nodes, when it is time to schedule the next block of a stream, all the streams scheduled in the current slot would request a block from the next storage node in the sequence and, hence, would not have any conflicts. In our notation, a set [n_k, s_k^j] in slot j is followed by a set [n_k, (s_k^j + 1) mod N] in slot j+F in the next frame. It is clear that if [n_k, s_k^j] is source and destination conflict-free, [n_k, (s_k^j + 1) mod N] is also source and destination conflict-free: the network nodes are unchanged, and adding 1 modulo N maps distinct storage nodes to distinct storage nodes. Variants of such data distributions have been proposed and analyzed [11, 8].

This simple approach makes movie distribution and scheduling straightforward. However, it does not address the communication scheduling problem. Also, it has the following drawbacks: (i) not more than one movie can be started in any given slot. Since every movie stream has to start at storage node 0, node 0 becomes a serial bottleneck for starting movies; (ii) when short movie clips are played along with long movies, short clips increase the load on the first few nodes in the storage node sequence, resulting in nonuniform loads on the storage nodes; (iii) as a result of (i), the latency for starting a movie may be high if the request arrives at node 0 just before a long sequence of scheduled busy slots.

The proposed solution uses one sequence of storage nodes for storing all movies, but it does not stipulate that every movie start at node 0. We allow movies to be distributed across the storage nodes in the same sequence, but with different starting points. For example, movie 0 can be distributed in the sequence 0, 1, 2, ..., N-1, movie 1 can be distributed in the sequence 1, 2, 3, ..., N-1, 0, and movie k (mod N) can be distributed in the sequence k, k+1, ..., N-1, 0, ..., k-1.
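The staggered placement and the conflict-free property it preserves can be illustrated as follows. This is a minimal sketch under the assumption that block i of movie k is placed on node (k + i) mod N, as in the example sequences above; the function names and the sample assignment are hypothetical.

```python
def storage_node(movie, block, n_nodes):
    """Node holding `block` of `movie` when movie k starts at node k mod N."""
    return (movie % n_nodes + block) % n_nodes


def conflict_free(pairs):
    """True if no network node and no storage node appears twice.

    `pairs` is the list of (network node, storage node) entries
    scheduled in one slot.
    """
    dests = [n for n, _ in pairs]
    srcs = [s for _, s in pairs]
    return len(set(dests)) == len(dests) and len(set(srcs)) == len(srcs)


def next_frame(pairs, n_nodes):
    """Entries for the same streams one frame (F slots) later."""
    return [(n, (s + 1) % n_nodes) for n, s in pairs]


N = 4
movie1_nodes = [storage_node(1, b, N) for b in range(N)]  # 1, 2, 3, 0

# Three streams starting in the same slot on movies that begin at
# storage nodes 1, 3, and 0; the assignment is conflict-free.
slot = [(0, 1), (1, 3), (2, 0)]
stays_clean = True
for _ in range(10):
    stays_clean = stays_clean and conflict_free(slot)
    slot = next_frame(slot, N)
```

Because every stream advances by one node per frame, a column that starts without node conflicts never develops one, and streams on movies with different starting nodes can begin in the same slot.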
We can choose any such sequence of storage nodes, with different movies having different starting points in this sequence. When movies are distributed this way, we achieve the following benefits: (i) multiple movies can be started in a given slot. Since different movies have different starting nodes, two movie streams can be scheduled to start at their starting nodes in the same slot. (ii) Since different movies have different starting nodes, even when the system has short movie clips, all the nodes are likely to see similar workload and, hence, the system is likely to be better load-balanced. Different short movie clips place the load