Saturday, October 18, 2008

TAG

This paper presents a way to collect aggregate values (thus it is called Tiny AGgregation) across sensor networks. The key difference between a sensor network and a normal network is the need to conserve power, which is what this paper addresses by decreasing both the size of packets and the number of packets which must be sent by one of these sensors.

They first present an SQL-like query language, which basically allows the following keywords: SELECT, GROUP BY, HAVING, WHERE, and some aggregation functions (as well as EPOCH and DURATION, which aren't that important to the algorithm). Next, they build a tree of nodes (where each node has exactly one parent, and the root is the user), using an arbitrary routing algorithm. This tree is updated periodically.

Next, they classify the aggregates that they can use. There are several different properties that are important. First is montonicity. Monotonic aggregates include things like MAX and MIN, and their monotonicity can be used to 1) evaluate some HAVING predicates before passing data to a parent, potentially saving one communication and 2) snooping on other nodes to determine whether or not to send your own data. The second property is the type of partial state that must be transmitted per node. The list of types contains Distributive, Algebraic, Holistic, Unique, and Content-Sensitive, which basically boils down to either constant per node or linear in the number of total children. This has a significant affect on the amount of communication and power consumption of nodes, especially parents close to the user. The other properties are Exemplary/Summary and Duplicate Sensitive, which affect optimizations that can be achieved.

Given these properties, each aggregate function is assigned a set of these properties, as well as two functions: a combining function and an evaluating function. The combining function allows each individual node to combine data from multiple children. The evaluation node allows the user to compute their actual desired value. Each node computes which group it belongs to, and separates the data by group id. Each node then hears the data from all of its children, and combines data for identical group ids. It then passes this information on to its parent. This allows computation to be done at the nodes, rather than done by the user, which saves transmissions and therefore power.

No comments: