Thursday, October 1, 2009

Modelling and developing the information to manage an Internet Data Centre

This paper proposes a prototype data model for managing a data centre.
A data centre is a facility hosting automated systems that provide a set of services, such as monitoring server activity, web traffic and network performance. These systems report any anomaly to the staff for possible intervention. Another example is a CDN (Content Delivery Network), which stores and serves data.

A data centre delivers services with performance guarantees, so resource usage must be maximised while cost is minimised; hence the need to manage these systems.
This management relies on three elements:
  1. The communication protocol used to exchange the information.
  2. The message format (XML tags, for example).
  3. The information itself, which is defined by a data model.



Vergara et al. discuss two approaches to adopting a data model for data centres:
1 - CIM (Common Information Model): a data model proposed by the DMTF (Distributed Management Task Force), which gathers the main equipment manufacturers and vendors (systems, networking and others). CIM aims to unify the already existing data models.
For data centres, CIM contains overly detailed (sometimes useless) information while lacking other information that is very useful for DC management. Moreover, the team suggests using a database system to store and manage the information. Databases follow a relational approach, so a translation from the object-oriented approach to a relational schema is required, even though manipulating objects offers better flexibility.

2 - A custom data model: proposing yet another data model is not very attractive. Indeed, the new data model would not be a standard, so every new piece of equipment would have to be integrated into the model, whereas with the DMTF it is up to the manufacturers/vendors to align themselves with the CIM data model.

The new data model is built by combining the two approaches:
use CIM while applying a set of extensions/reductions to adapt it to the needs of data centres. Since CIM is a standard, this avoids proposing a new model; they therefore adapt CIM to their needs.
CIM offers an object-oriented approach that provides better management and flexibility, while relational databases offer better storage and management of the information, so a translation from the object model to the relational model is applied.

This new data model is split into two sub-models:
the first handles configuration management and the second is a hardware data model.

The configuration data model is based on CIM, with extensions and reductions.
Once the data model is written in CIM, it is translated into a relational schema using a set of rules.
Figure: CIM extended with a deployment date, among other attributes.



Figure: an example of a CIM extension mapped to the relational model.
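Since the mapped schema is only visible in the paper's figure, here is a minimal sketch, in Python, of what such a rule-based CIM-to-relational translation might look like; the class name, properties and type mapping below are invented for illustration, not taken from the paper.

```python
# Hypothetical CIM-like class description; the paper's real extended schema
# (deployment date, etc.) is only shown in its figures.
cim_class = {
    "name": "DC_ComputerSystem",
    "key": "Name",
    "properties": {"Name": "string", "DeploymentDate": "datetime", "Status": "uint16"},
}

SQL_TYPES = {"string": "VARCHAR(255)", "datetime": "TIMESTAMP", "uint16": "INTEGER"}

def cim_to_ddl(cls):
    # One table per class, one column per property, the CIM key becomes the primary key.
    cols = [f"{name} {SQL_TYPES[ctype]}" for name, ctype in cls["properties"].items()]
    cols.append(f"PRIMARY KEY ({cls['key']})")
    return f"CREATE TABLE {cls['name']} (\n  " + ",\n  ".join(cols) + "\n);"

print(cim_to_ddl(cim_class))
```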



For the hardware data model, the HP data centre uses a system that scans all the servers on the network and generates an XML file containing all the information (memory size, number of partitions, etc.). This XML file has a DTD that is completely different from CIM's.
Representing this information with CIM seemed complicated to them, so they decided to map the XML file to a relational model.




Once the two sub-models are mapped to a relational model, they can be linked using unique keys: given a machine identifier (a unique key), one can look up both the configuration model and the hardware model. This identifier ties the two models together.
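As a rough illustration of that link (the table and column names here are invented, not the paper's), a single machine identifier can serve as the join key between the two relational sub-models:

```python
import sqlite3

# Illustrative schema only; the paper's actual tables and columns are not reproduced.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE configuration (machine_id TEXT PRIMARY KEY, os TEXT, deploy_date TEXT);
CREATE TABLE hardware      (machine_id TEXT PRIMARY KEY, memory_mb INTEGER, partitions INTEGER);
INSERT INTO configuration VALUES ('srv-042', 'HP-UX', '2001-03-15');
INSERT INTO hardware      VALUES ('srv-042', 512, 4);
""")
# The unique key machine_id ties the configuration view to the hardware view.
row = db.execute(
    "SELECT c.os, c.deploy_date, h.memory_mb, h.partitions "
    "FROM configuration c JOIN hardware h USING (machine_id) "
    "WHERE c.machine_id = ?", ("srv-042",)).fetchone()
print(row)   # ('HP-UX', '2001-03-15', 512, 4)
```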

Conclusions:
This paper proposes a prototype data model, made of two sub-models, for managing data centres.
The approach consists in proposing a configuration model based on a simplified CIM; in other words, rather than proposing a new model, they refine a standardised data model and use it for their needs.
The second sub-model maps the XML file produced by scanning the machines to a relational model. The two models are linked by unique identifiers that allow moving from one to the other.
It is not entirely clear how the same unique identifier is guaranteed across both models; doesn't this operation require human intervention?

Tuesday, September 29, 2009

Toward a Formal Common Information Model Ontology

The problem addressed in this paper is the following:
how can a management system be made self-managing?
Such a system is in charge of driving a set of heterogeneous devices or nodes (heterogeneous in functionality, hardware, software, data model and properties), ranging from small sensors (temperature, luminosity, etc.) to routers, including mobile phones.
A management operation can be, for example, adding a new feature to a mobile phone so that its user can trigger the water heater remotely (away from home).
Such an operation requires deploying code on, among others, the mobile phone and the gateway in charge of managing the home devices in order to reach that goal.

Self-management can be, for example, improving the system's performance during a peak period by proposing another network configuration that makes better use of the available resources.

CIM (Common Information Model) is a standard whose goal is to provide a data model common to all the devices in the system. This gives a unified view that eases the management of these devices, regardless of the management protocol and data model each of them uses.

Although each of these devices is tied to a specific data model and management protocol, interoperability between data models is not enough to obtain a self-managing system.
Indeed, a self-managing system must be able to take "decisions" in order to improve performance or meet well-defined criteria.
To get there, we need to express a set of rules, constraints and axioms that allow the system to reason and take "decisions".

Ontologies are used on the Web to improve expressiveness and to introduce semantics into the description and presentation of data. Ontology languages also make it possible to formalise constraints, descriptions and axioms so that the system can process them. Using ontologies therefore brings us closer to self-managing systems.

According to this paper, CIM falls into the class of "semi-formal ontologies": it lacks expressiveness for constraints, axioms and rules. Moreover, CIM uses XML and XML Schema to represent and model data, and these tools require an upfront interpretation effort (parsing) to extract the useful and necessary information contained in the tags. In other words, two entities can exchange information only if they share a predefined interpretation for extracting the useful information from the tags.

Quirolgico et al. propose to map CIM to RDF/S (Resource Description Framework/Schema). CIM would thus be written in RDF/S, moving from a semi-formal ontology to a more or less complete one; RDF/S would extend CIM to support missing features such as axioms and constraints.
RDF is an assertion language that defines semantic constructs for expressing propositions, in the form of 3-tuples (subject, predicate, object).
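As a toy illustration of that triple structure (the resource and property names below are invented, not taken from the paper), management facts can be handled as plain (subject, predicate, object) statements that any party can query without a pre-agreed tag layout:

```python
# Invented CIM-flavoured statements, just to show the (subject, predicate, object) shape.
triples = {
    ("cim:Server42", "rdf:type", "cim:ComputerSystem"),
    ("cim:Server42", "cim:runsOperatingSystem", "cim:HPUX"),
    ("cim:ComputerSystem", "rdfs:subClassOf", "cim:ManagedSystemElement"),
}

# Ask which classes Server42 belongs to: nothing beyond the triple shape is needed.
print([o for (s, p, o) in triples if s == "cim:Server42" and p == "rdf:type"])
```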



Using RDF/S would make it possible to:
  1. exchange information and reason without predefined semantics, unlike XML and its tags;
  2. add rules and axioms to improve the system's reasoning;
  3. combine the data of the CIM ontology with other vocabularies or ontologies "more easily".
Since the CIM schema is described in UML, the mapping from CIM to RDF/S builds on earlier work [1, 2, 3] that proposes translation methods from UML (Unified Modeling Language) to ontology languages such as RDF/S (Resource Description Framework/Schema) and OWL (Web Ontology Language).

Building the CIM ontology in RDF/S was abandoned because RDF/S cannot express cardinality constraints, nor can it represent some CIM-specific aspects such as the "Key" qualifier, i.e. an attribute with a unique value.
One can also add that RDF/S cannot express constraints on properties, such as transitivity (if a R b and b R c then a R c) or inverse properties.

OWL (Web Ontology Language), which offers better expressiveness, was chosen to represent the CIM ontology. OWL adds semantics to the relations linking CIM classes through properties such as owl:TransitiveProperty, owl:inverseOf, owl:equivalentProperty and others.
These properties ease the mapping between two ontologies expressed in OWL. For example,
"C1 is_subcomponent_of C2" in ontology 1 is equivalent to "C2 is_parent_component_of C1" in ontology 2, so is_subcomponent_of is the inverse property of is_parent_component_of (owl:inverseOf).
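A tiny sketch of what owl:inverseOf buys during mapping (the property names are the invented ones from the example above): once the inverse relation is declared, the second ontology's statement can be derived mechanically from the first one's.

```python
# Asserted fact from "ontology 1" and the declared inverse property (invented names).
facts = {("C1", "is_subcomponent_of", "C2")}
inverse_of = {"is_subcomponent_of": "is_parent_component_of"}

def apply_inverse(facts, inverse_of):
    # For every (s, p, o) whose property has a declared inverse, entail (o, inverse(p), s).
    derived = {(o, inverse_of[p], s) for (s, p, o) in facts if p in inverse_of}
    return facts | derived

# The result contains both ('C1', 'is_subcomponent_of', 'C2')
# and the derived ('C2', 'is_parent_component_of', 'C1').
print(apply_inverse(facts, inverse_of))
```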
This translation, again, is based on the mapping between UML and OWL [4].

Building the CIM ontology with OWL also has its limits: some aspects of CIM exist neither in UML nor in RDF/S or OWL, such as default values for attributes.
Moreover, OWL cannot describe abstract classes (unlike CIM and UML).
Of course, these concepts can be added to the ontology, but that brings us back to the same situation as with XML Schema, where predefined knowledge must be shared by two or more entities in order to extract the useful information.


Conclusions:
CIM is specific to the domain of device management, whereas ontologies are often very generic and cannot express all concepts and properties without predefined or pre-shared knowledge for interpreting the information.

Références:
  1. Baclawski et al.: Extending UML to support ontology engineering for the semantic web. Lecture Notes in Computer Science 2185 (2001).
  2. Chang et al.: A discussion of the relationship between RDF-Schema and UML, W3C Note (1998).
  3. Cranefield, S.: Networked knowledge representation and exchange using UML and RDF. Journal of Digital Information 1 (2001)
  4. AT&T: OWL Full and UML 2.0 compared. Whitepaper (2004)

Thursday, September 17, 2009

Review: An ontology-based method to merge and map management information models

The problem is still the same: interoperability between data models. Such interoperability eases system management by offering the administrator a unified view of the entities present in the system, a view that makes it possible to act on the system regardless of the underlying topology, architecture (hardware/software) or management protocol.

Interoperability is currently achieved through syntactic approaches, which do not preserve the meaning, i.e. the semantics, when moving from one data model to another. As a consequence, the mapping is incomplete: the translated entities are detached from their original data model and lose their meaning. One approach is therefore to rely on ontologies, in order to preserve as much meaning as possible and obtain an effective translation between two data models.

This paper briefly describes the essential steps for mapping between two data models. Vergara et al. first propose to translate the data models into a common ontology, written in the ontology language DAML+OIL (later superseded by OWL). They apply this approach to two data models, CIM and Host-Resources-MIB, and map both onto a common ontology.
Figure: the elements are written in the same ontology language, DAML+OIL; the first three are CIM entities and the last two are Host-Resources-MIB entities.





Once the two data models are rewritten in DAML+OIL, they apply their M&M (Merge & Map) method, which, according to the authors, runs both processes in parallel. The method uses a set of rules to determine a match between two entities from two different meta-models.

These rules are based on ontology mapping and alignment techniques, for example:
  • syntactic matching: two entities have the same name, share a prefix or a suffix, etc.;
  • rules on data types, to decide whether a mapping exists;
  • two entities belong to the same domain (or class);
  • hierarchical matching: if two entities match, there is a strong probability that their attributes or sub-entities are also candidates for a mapping;
  • attention is also paid to unit conversion (bits vs. octets, for instance) and to the concatenation of entities, for example power consumption = screen consumption + PC consumption.

Once the mapping between the two data models is established, the translation between them is carried out with formulas written in MapTrans (a fictional language similar to JavaScript). The formulas allow translation in both directions.
Figure: a formula written in MapTrans, where KernelModeTime (CIM) and UserModeTime (CIM) are both mapped to hrSWRunPerfCPU (Host-Resources-MIB).
The conversion rule is hrSWRunPerfCPU = (KernelModeTime + UserModeTime) * 10.
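MapTrans itself is only illustrated in the paper's figure, but the quoted rule can be sketched in plain Python; note that the reverse direction of this particular rule can only recover the sum of the two CIM counters, not each of them.

```python
def cim_to_mib(kernel_mode_time, user_mode_time):
    # hrSWRunPerfCPU = (KernelModeTime + UserModeTime) * 10
    return (kernel_mode_time + user_mode_time) * 10

def mib_to_cim_total(hr_sw_run_perf_cpu):
    # Going back, only the combined KernelModeTime + UserModeTime can be recovered.
    return hr_sw_run_perf_cpu // 10

print(cim_to_mib(120, 80))      # 2000
print(mib_to_cim_total(2000))   # 200
```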





This solution requires human intervention and supervision to validate the various steps of the "automatic" mapping between ontologies. This can take time, but it remains lighter than doing the mapping entirely by hand.
Figure: the steps in grey are carried out by a user (administrator), those in white by the system.

Tuesday, September 15, 2009

Review: Applying the Web Ontology Language to management information definitions

Jorge Vergara et al. propose to use the Web Ontology Language (OWL) to give data models more semantics and expressiveness.

They propose OWL for the following reasons:
  1. OWL is based on RDF and RDFS.
  2. RDFS and RDF are based on XML and offer a structured set of terms: class hierarchies, domains and constraints.
  3. XML (eXtensible Markup Language) is widely used as a format for exchanging, representing and interpreting information. Its adoption is helped by the large number of tools and libraries that ease validation (DTD) and processing of information. Moreover, XML can be used to re-express the information in a management-protocol interoperability scenario.
Vergara insists on the need to introduce semantics into existing data models, so that translating from one to another becomes easy. Currently, moving between two data models is done by recasting, i.e. a rather manual, syntactic mapping.
The idea is to translate the data models into OWL.
Indeed, current data models can be seen as "lightweight" ontologies, since they provide, among other things, a set of concepts (classes, attributes, etc.), instances of concepts and relations (subclasses, class associations, etc.). However, these data models are not expressive enough; for instance, there is no way to write axioms.

OWL, on the other hand, is a general-purpose ontology language and lacks a few aspects needed to translate data models into it: access rights (read, write), measurement units (bits/second) and default values for attributes. The authors extend OWL with these aspects, building on RDFS (see figure).




Conclusion:
This paper presents an approach to data-model interoperability based on OWL. OWL builds on RDFS, which can describe a data model through classes, properties, hierarchies, domains and types.
RDFS itself builds on XML, which eases information exchange and processing with existing tools.
Extensions were added to OWL (using RDFS) so that every aspect of a data model can be translated into OWL.
The authors also propose to express rules and axioms with SWRL (Semantic Web Rule Language) (more details soon).

Thursday, September 10, 2009

Review: Ontologies: Giving Semantics to Network Management Models

This paper raises the same problem: the heterogeneity of data models and the inadequacy of purely syntactic translation from one data model to another.

It first motivates the addition of semantics through ontologies. The paper then presents the main data/information models (GDMO, SMI, MIF, IDL, SMIng and CIM) and compares them in order to determine whether one can easily move from one data model to another; this is the case when both data models support the same (ontology) criteria.

CIM (Common Information Model) is the candidate that best meets the comparison criteria (inspired by [1] and [2]) in terms of meta-classes, partitions, attributes, facets, taxonomies, relations and behaviour (axioms and roles). CIM is also the most semantically expressive of these languages, but it still needs extensions.

To integrate ontologies with data models, the authors propose two approaches:
  1. Perform a mapping between every pair of data models.
  2. Define a (final or pivot) data model that includes the other existing data models.
The second approach seems better suited when the number of data models is large.
They propose to modify the second approach, because the subsets shared between data models do not all have the same weight: they take the largest data model, the one that best covers the management domain, and then extend it with the small parts it does not support, in order to define the final data model.

This final data model, or global ontology, is CIM with extensions that cover other aspects such as axioms, behaviour and constraints. The goal is to be able to express constraints such as: "the free space of a CIM_FileSystem instance must be larger than 10% of the file system's size".


Conclusions:
CIM is a data model with more advanced semantic expressiveness than the other data models (GDMO, SMI, MIF, IDL, SMIng). However, it lacks the definition of taxonomies (subclass of, not subclass of, disjoint decomposition between two concepts, etc.) and of the relations needed to express constraints.
When mapping between data models, existing ontology tools can be reused. According to the authors, this solution requires manual intervention to validate certain tasks, which can take a significant amount of time when applied to large data models, but it remains better suited than doing the mapping entirely by hand.

Références:
1 - Corcho et al., A roadmap to ontology specification languages, EKAW 2000.
2 - Wache et al., Ontology-based integration of information - a survey of existing approaches, IJCAI 2001.

Interoperability of naming and designation

Context:
We are particularly interested in the problem of naming and designation in management protocols.
A management protocol handles a set of functions between two or more entities, such as energy management, quality of service, diagnostics, software updates, etc.

The entities range from small devices such as mobile phones up to large auto-configuration servers.
These devices support a data model (or information model) specific to a management protocol. The data model specifies how the manipulated data are represented (hierarchically, for example) and how values and data are accessed (Parent.Child.Class.Parameter, for instance).

Management protocols keep multiplying, each acting on a specific management domain, and this diversity of protocols translates into a diversity of data models.
The management domain represented by a data model usually has one or more subsets in common with those of other management protocols.


Problem statement:
A service operator sometimes has to manage devices that do not support the data model specific to its management protocol. To manage such heterogeneous devices, a translation between data models is required.
However, a syntactic translation between data models is not enough, because the meaning of the designation is lost; hence the need to introduce semantics into the data models.

One avenue to explore is the use of ontologies, i.e. mapping the data models to an ontology language (OWL) and then matching the two ontologies; once the matching is done, the translation can take place.

Friday, June 19, 2009

Google Voice

Interesting article about Google's service for redirecting and forwarding phone calls to other phones.


Link to the article

Monday, June 15, 2009

Google's, Amazon's and Microsoft's new data centers

Interesting article about Google's, Amazon's and Microsoft's new data centers.

Link

Friday, June 12, 2009

The shadow network architecture

An interesting link revealing the possible use of hidden botnets in networks.

Link

Thursday, June 11, 2009

Review: Design considerations for a network of Information

In this position paper the authors argue that the Internet has to be reshaped around data instead of endpoints, in order to obtain a data-centric network, or network of information.
According to Jacobson [1], the first generation of the network dealt with connecting wires and the second one focused on end nodes hosting data, while the third generation should refocus on what humans care most about: information.

Their information model distinguishes between two main kinds of objects:
Data Objects (DO): the actual bit patterns that contain the information, such as a file, a phone call, a video, a web page or a song (Beethoven's 9th symphony). Data objects can be divided into smaller pieces, "chunks", to simplify transfer.
Information Objects (IO): hold semantic information and metadata related to data objects, for example "Beethoven's 9th symphony is an MP3 file encoded at 128 kbps". IOs can be composed of other IOs or point directly to one or several DOs. An IO can represent the Eiffel Tower and point to DOs such as pictures, a wiki page or a ticket-buying service.

Versioning and Revocation:
Some information changes frequently, newspapers for instance. An IO can represent today's edition, but it should then adapt dynamically by binding to another DO (web page) the next day, and similarly for the IO pointing to yesterday's news.
The authors suggest that objects invalidate themselves in order to preserve consistency: after some amount of time, an object must be recertified before it can be used again. This maintains consistency despite disconnected operation while other replicas are being updated, for example. DOs can be deleted in the same way, when no certification is granted.

Security considerations:
Security in today's architecture is based on trusting (through encryption keys) the host delivering the object; the authors propose to reshape security conventions so that we handle secured data instead of secured tunnels.
Integrity and authenticity are directly tied to object names, meaning there would be a cryptographic relation between a name and its object, such as self-certifying names. However, to enable off-line verification, DOs would have to carry private keys, which can be compromised. Another approach is to assign precomputed signatures to objects. This remains an open research area.

Name resolution (NR):
Data objects are retrieved based on their unique identity (UID). NR starts by locating the object in the network, then routing forwards the object retrieval query to its storage location, and finally the DO is sent to the requesting client.
Name resolution resolves a UID into one or more locations and should work at both global and local scale, for example through cooperation between NR systems. The authors illustrate the side effects of an ID/address split mechanism with the following example: if a laptop hosting numerous data objects changes location, the location of all those data objects changes too, which leads to a huge number of updates in the NR system.
The NR system will be influenced by the characteristics of the namespace. The authors would like to adopt flat names, which respect the absence of ownership rights and the other characteristics identified in [2].
Of course, using flat names rules out hierarchical namespaces and systems like DNS.
DHT-based solutions are promising since they are decentralised, scalable, self-organising and need no central structure. However, at global scale a DHT uses flat, non-hierarchical names, which hinders cooperation with other systems.

Routing:
The number of addressable entities keeps increasing and will reach millions, even billions, within a few years with the emergence of sensor networks, the Internet of Things, growing data, etc. The authors claim that routing research is not encouraging, according to [3] (to be reviewed later). Hence, they will investigate the efficiency of name-based routing, which integrates both the resolution and the retrieval paths: data objects are located by their ID, the ID being transformed directly into a path without going through an ID-to-address translation. Other techniques such as LLc and NodeID are also to be investigated (reviews to come soon).

Storage:
The network of information can be implemented following two different models:
  • a network-based storage model, where storage resources are provided by the network infrastructure, for example dedicated storage servers;
  • a network-managed storage model, where network nodes control portions of the storage of the users connected to the network. Users can then decide which DOs go public, which are shared only with friends, and so on.

Search:
Search systems are expected to go far beyond text matching, towards semantic search or even search based on GPS position and location. For example, when a picture of the Eiffel Tower is taken, a search mechanism could identify the monument based on GPS or other techniques and point to DOs carrying related information such as a web page, its history, etc.

This position paper gives many ideas and anticipations about the future Internet architecture and reveals weaknesses in the current addressing system. The authors distinguish between DOs and IOs and argue that a network of information needs a scalable naming system supported by an efficient routing system.

References:
1 - V. Jacobson, M. Mosko, D. Smetters, and J. Garcia-Luna-Aceves. Content-centric networking. Whitepaper, Palo Alto Research Center, Jan. 2007.
2 - M. Walfish, H. Balakrishnan, and S. Shenker. Untangling the web from DNS. In NSDI’04: Proc. 1st Symp. on Networked Systems Design and Implementation, San Francisco, CA, USA, 2004.

Link to the article

Thursday, May 28, 2009

Review: Survey and Taxonomy of IP Address Lookup Algorithms

I will first start with some basic definitions:
  • Address lookup: a router uses the packet's destination address as a key into its forwarding table in order to decide the packet's next hop.
  • A trie is a tree-based data structure that organises prefixes on a digital basis, using the bits of the prefixes to direct the branching. A node at level l represents the set of all addresses that begin with the sequence of l bits labelling the path from the root to that node (according to the survey).
  • Class-based address scheme: IPv4 addresses were first designed as two-part identifiers: a prefix designating the network and a suffix identifying a host. The prefix belongs to class A (8 bits), B (16 bits) or C (24 bits). The prefix 130.86/16 is a class B prefix of length 16 bits (the number after the slash).
  • Address aggregation: since routers forward packets based on the IP prefix, a forwarding table only needs to store a single entry to forward packets to all hosts attached to the same network. This technique is called address aggregation.
  • CIDR (Classless Inter-Domain Routing): CIDR was introduced to slow down the growth of backbone forwarding tables. CIDR prefixes can be of arbitrary length rather than the 8, 16 or 24 bits of classful addresses, which allows recursive address aggregation: backbone forwarding tables hold fewer entries and store aggregate information rather than per-network information. Example: consider the networks 208.12.16/24 through 208.12.31/24, all reachable through the same service provider. These networks share the same leftmost 20 bits, so instead of storing all of them in a backbone router, they can be aggregated into a single "supernet" entry represented by the common 20-bit prefix 208.12.16/20, and the backbone router holds only that one entry (see the sketch after this list).
  • Network renumbering: when a network changes its service provider, its IP prefix changes, which forces a change of the IP addresses assigned inside the network; this is called network renumbering.
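A quick way to check the aggregation example above, assuming Python's standard ipaddress module:

```python
import ipaddress

# The sixteen /24 networks 208.12.16.0/24 ... 208.12.31.0/24 share their leftmost
# 20 bits, so they collapse into the single aggregate 208.12.16.0/20.
nets = [ipaddress.ip_network(f"208.12.{i}.0/24") for i in range(16, 32)]
print(list(ipaddress.collapse_addresses(nets)))   # [IPv4Network('208.12.16.0/20')]
```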
What is the longest prefix matching?

The CIDR scheme reduces forwarding tables through aggregation; however, a few situations can still interfere with aggregation. Suppose, for instance, that the network 208.12.21/24 changes its service provider but still wants to keep the same IP prefix and the IP addresses assigned inside the network. All networks from 208.12.16/24 through 208.12.31/24 can still be reached through the original provider except 208.12.21/24, which breaks the aggregation. The backbone router can no longer advertise the aggregate alone: it must store both 208.12.16/20 and the exception entry 208.12.21/24. Since the two entries share the same 20-bit prefix, a lookup must find the longest matching prefix in order to obtain the most specific forwarding information towards the correct destination. The search therefore has two dimensions: value and length.
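A minimal sketch of the resulting lookup, again with the ipaddress module and invented next hops: the /24 exception wins over the /20 aggregate precisely because it is the longer matching prefix.

```python
import ipaddress

# Forwarding table for the example above: the /20 aggregate plus the /24 exception.
table = {
    ipaddress.ip_network("208.12.16.0/20"): "provider A",
    ipaddress.ip_network("208.12.21.0/24"): "provider B",
}

def longest_prefix_match(destination):
    addr = ipaddress.ip_address(destination)
    matching = [net for net in table if addr in net]
    return table[max(matching, key=lambda net: net.prefixlen)]

print(longest_prefix_match("208.12.21.7"))   # provider B (most specific entry)
print(longest_prefix_match("208.12.18.7"))   # provider A (only the aggregate matches)
```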

Why do we need scalable and fast address lookup algorithms?
  • The number of entries in routing tables keeps growing, due to the increasing number of networks and exceptions.
  • The number of update operations on routing tables is increasing: forwarding tables need to be updated dynamically to reflect route changes. According to [1], route changes occur about 100 times per second, so update operations must be fast (less than 10 ms).
Solutions:

Binary Trie:
In a binary trie, every node has at most two branches, labelled 0 and 1. The search starts at the root and goes left or right according to the bits of the destination address. While traversing the trie, every time a node marked as a prefix is visited (nodes with letters inside in the figure), that prefix is remembered as the longest prefix seen so far. The search ends when there is no branch matching the next bit of the destination address, and the best matching prefix (BMP) is the last prefix remembered. The binary trie performs a sequential prefix search by length; at each step the search space is reduced hierarchically. Update operations are simple: inserting a prefix starts with a search until a node with no matching branch is reached, where the new node can be inserted; deleting also starts with a search, unmarks the node as a prefix and removes unused nodes. Nodes do not store the prefixes themselves, since prefixes are represented by the trie structure.
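A minimal binary trie in Python, with toy prefixes rather than the survey's figure, to make the "remember the last prefix seen" behaviour concrete:

```python
class Node:
    def __init__(self):
        self.children = {}   # keys are the bits '0' and '1'
        self.prefix = None   # set when this node marks a stored prefix

def insert(root, prefix_bits, label):
    node = root
    for bit in prefix_bits:
        node = node.children.setdefault(bit, Node())
    node.prefix = label

def lookup(root, address_bits):
    # Walk the address bits, remembering the last prefix node visited (the BMP so far).
    node, best = root, None
    for bit in address_bits:
        if bit not in node.children:
            break
        node = node.children[bit]
        if node.prefix is not None:
            best = node.prefix
    return best

root = Node()
insert(root, "0", "a")        # toy prefixes, not the ones in the survey's figure
insert(root, "01000", "b")
print(lookup(root, "01000110"))   # 'b': the longer of the two matching prefixes
```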


Path-compressed tries were introduced to improve the space and time performance of tries containing sequences of one-child nodes. Prefix b in the figure above ends such a one-child sequence. When searching a binary trie, the bits separating prefix a from prefix b must be inspected even though no other branch exists; this extra search wastes time and the extra nodes (between prefixes b and a) waste memory. In a path-compressed trie, one-child nodes are removed, so a search can jump directly to the bit where a decision can be made. A bit-number field is kept to indicate the next bit to inspect: the bit-number field of each traversed node gives the position to inspect, and the bit value found there tells which branch to follow (left if 0, right if 1). When a node marked as a prefix is encountered, a comparison with the actual prefix value is performed; this step is essential because some bits are skipped. If the comparison matches, the search continues and the prefix is kept as the BMP so far; the search ends on a mismatch or at a leaf. Path compression is useful when the binary trie is sparsely populated, otherwise the comparisons bring little benefit. In general, binary and path-compressed tries require many memory accesses (32 in the worst case for an IPv4 address).

Simple Search on values approach:
The data structure is just an array of unordered prefixes. The search goes through all the entries, comparing them with the destination address; when a prefix matches, the longest match so far is kept, and at the end the last prefix remembered is the BMP. This scheme is clearly a function of the number of entries and does not scale.

Prefix transformation:
Forwarding information is based on prefixes, which represent address ranges. One of the most common prefix transformation techniques is prefix expansion, which expands one prefix into several longer, more specific prefixes covering the same range of addresses. Appropriate prefix expansion results in fewer distinct prefix lengths and can be used to make the search faster. To avoid the longest-prefix match altogether, disjoint prefixes can be used: disjoint prefixes do not overlap and no prefix is a prefix of another. A trie representing disjoint prefixes has prefixes only at its leaves, not at internal nodes. A disjoint-prefix binary trie is obtained by adding leaves to nodes that have only one child. The following figure is a transformation of the figure above: prefixes a1, a2 and a3 replace prefix a, and prefixes are pushed from internal nodes down to the leaves, which makes them disjoint.


Search on prefix lengths using Multibit Tries:
A multibit trie is a trie structure that inspects several bits at each step, which increases the search speed compared to inspecting one bit per step. The number k of bits inspected per step is called the stride, and every node in a multibit trie has at most 2^k children. Since the search traverses strides of several bits at a time, multibit tries cannot support arbitrary prefix lengths. The height of the trie decreases compared to a binary trie, and so does the number of memory accesses per lookup.
The choice of stride influences both search speed and memory consumption: as the stride increases, the search speed increases (the trie's height decreases), but so does the memory needed to store the larger number of entries. With a stride of 32 bits for IPv4, a single level access suffices, but memory for 2^32 entries is required.
When prefix expansion is applied, the local BMP is precomputed for each node of the subtrie. Inserting or deleting a prefix operates on the subtries, so prefix updates are local: if the new prefix is stored in a subtrie with a stride of k bits, the update modifies at most 2^(k-1) nodes.
Larger strides therefore make searches faster, but more memory is needed and updates touch more entries because of the expansion.
Multibit-based solutions such as the level-compressed trie, the Lulea scheme, full expansion/compression and many others emerged to optimise memory consumption and increase search speed. However, these approaches handle update operations poorly; rebuilding the whole structure is often the only solution.

Binary Search on Prefix Lengths:
Waldvogel [2] proposes another method for prefix lookup that does not use a trie structure. Prefixes are organised in different tables according to their lengths, and a binary search halves the search space at each step. A comparison is applied at each step in order to decide towards which half the search should proceed: if a match is found, the search can proceed towards longer prefixes, but if there is no match, no decision can be made about the direction to take (longer or shorter prefixes?). Waldvogel therefore proposes to use a precomputed BMP on each prefix and extra prefixes, called markers, to be sure that when no match is found the binary search can safely proceed towards shorter prefixes.
For example, to find the BMP of the address 11000010 (length 8), the search starts in the table of prefixes of length 4 (half the length). Comparing entries with 1100, a match is found with prefix f, so f is the BMP so far and the search proceeds with 110000 (halving the remaining 4 bits leaves a 6-bit search). If another match occurs in the 6-bit tables, that prefix becomes the BMP so far. The remaining part of the address is 10 (2 bits), so the search then proceeds with 1100001 in the 7-bit tables, and so on. When moving to the next step, extra marker prefixes are inserted at the adequate lengths to direct the search towards shorter (resp. longer) prefixes when there is no match (resp. a match). I think such markers are typically inserted where the one-child subtrie situation occurs.
The markers and precomputed BMPs increase memory usage and make updates harder, because these values also need to be updated.

Prefix Range Search:
Prefix range search flattens the two-dimensional search (length and value) into a single dimension by getting rid of the length dimension. One way to do this is to apply full prefix expansion so that all prefixes have the same length (expansion to 32-bit addresses for IPv4).
Once this transformation is applied, a range search can be performed. Every prefix determines a well-defined range of addresses: for 5-bit addresses, a prefix a = 0* defines the range [0, 15], since after expansion it covers [00000, 01111]. Hence, instead of storing addresses, we store range endpoints. In theory, the BMP of an endpoint is the same as that of all addresses in the same range, so finding the BMP of an address reduces to finding the endpoints of the corresponding interval. However, this does not work directly, because prefix ranges may overlap: ranges are not disjoint and one range may be contained in another.
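The interval computation itself is just a prefix expansion with all-zero and all-one suffixes; a small sketch for the 5-bit example above:

```python
def prefix_range(prefix_bits, width=5):
    # Pad the prefix with zeros for the low endpoint and with ones for the high endpoint.
    low = int(prefix_bits.ljust(width, "0"), 2)
    high = int(prefix_bits.ljust(width, "1"), 2)
    return low, high

print(prefix_range("0"))       # (0, 15): the prefix 0* covers [00000, 01111]
print(prefix_range("10110"))   # (22, 22): a full-length prefix is a single address
```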
The endpoints do, however, divide the total address space into disjoint basic intervals, and the BMP of an address can be found using the endpoints of these basic intervals (see figure below). Some basic intervals do not have explicit endpoints, so each endpoint maintains two BMPs.
The figure below shows a search tree. For example, to find the BMP of the address 10110 (22), the search begins with the key value 26 and proceeds according to the comparison decisions until it reaches the BMP value "d".
Since the BMP of each basic interval must be precomputed, an update operation may require recomputing the BMPs of many basic intervals.
A multiway search tree can be used, just like the multibit trie, to accelerate the search and reduce the tree's height.

IPv6 related:
According to the survey, multibit tries scale badly with the longer addresses, and hence longer prefixes, of 128-bit IPv6, while range search seems to be a good approach to the scalability issues raised by IPv6.


This survey reveals important limitations of the Internet infrastructure and of the class-based design of IP addresses. It exposes the challenges facing routers and Internet growth, and points to the need for a scalable algorithm that increases address lookup speed while reducing memory consumption and memory accesses. Such an algorithm should also handle incremental updates with minimal changes to the structure.
In my opinion, every naming system will face such challenges when it has to deliver packets to the appropriate destination or retrieve pointers using addresses as keys.
Other reviews will be added to cover the state of the art of the algorithms proposed after 2001.

References:
1 - C. Labovitz, “Scalability of the Internet Backbone Routing Infrastructure,” Ph. D. Thesis, University of Michigan, 1999.
2 - M. Waldvogel, G. Varghese, J. Turner, and B. Plattner, “Scalable High Speed IP Routing Lookups,” Proceedings of ACM SIGCOMM’97, pp. 25-36, September 1997.

Link to the survey

Wednesday, May 27, 2009

Review: RFC 4941 Privacy Extensions for Stateless Address Autoconfiguration in IPv6

RFC 4941 addresses privacy issues related to IPv6 stateless address autoconfiguration.

IPv6 stateless autoconfiguration allows hosts to generate IP addresses without a central node coordinating and distributing unique addresses. The mechanism generates a unique IP address from the IEEE interface identifier; the interface identifier is derived from a hardware identifier such as the MAC address, which is supposed to be unique, a property guaranteed by the manufacturer of the network card.
Using a unique interface identifier allows nodes to generate unique IPv6 addresses: the node prepends the network prefix to the interface identifier to obtain a 128-bit unique (local or global) address.
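A minimal sketch of that construction for a link-local address (the MAC address here is just an example value): insert 0xFFFE in the middle of the 48-bit MAC, flip the universal/local bit and prepend the link-local prefix.

```python
def mac_to_link_local(mac):
    # Split the MAC into its six bytes.
    parts = [int(byte, 16) for byte in mac.split(":")]
    # Flip the universal/local bit of the first octet and insert 0xFF 0xFE in the
    # middle to obtain the 64-bit interface identifier (modified EUI-64).
    parts[0] ^= 0x02
    eui64 = parts[:3] + [0xFF, 0xFE] + parts[3:]
    # Group into 16-bit pieces and prepend the link-local prefix fe80::/64.
    groups = ["%02x%02x" % (eui64[i], eui64[i + 1]) for i in range(0, 8, 2)]
    return "fe80::" + ":".join(groups)

print(mac_to_link_local("00:1a:2b:3c:4d:5e"))   # fe80::021a:2bff:fe3c:4d5e
```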
Consequently, when a node moves to another network, the prefix is the only part of the IPv6 address that changes, since the interface identifier always stays the same.
This unchanging part of the IP address allows tracking and user localisation, which violates privacy (is an employee at home? active? with whom is he communicating? etc.).
This privacy problem appeared with IPv6; it did not exist in IPv4, where IP addresses were assigned independently of the interface identifier.

Any possible approaches?
  • Use DHCPv6 to assign and manage addresses. Such addresses are also temporary but rarely change. RFC 4941 claims to provide an approach similar to DHCPv6 when using temporary addresses.
  • Change the interface identifier portion of the address over time and generate new addresses from the new interface identifier.
  • A caller-ID-like approach: many machines act as both clients and servers. When acting as a server, a machine needs a DNS name; the privacy issue appears when the machine acts as a client and its identity is revealed. (The analogy with caller ID is a user who lists his telephone number publicly but disables the display of his number when initiating calls.)
RFC 4941 proposal:
The approach generates temporary addresses from a pseudo-random sequence of interface identifiers computed with the MD5 hash. These addresses are used for a short period of time, and new temporary addresses are generated to replace expired ones. Nodes concerned about privacy may also use different interface identifiers on different prefixes.
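A rough sketch of the stable-storage variant of that procedure (simplified from RFC 4941, section 3.2.1; the stable interface identifier below is just an example value): hash the stored history value together with the stable identifier, keep the left 64 bits as the temporary identifier and the right 64 bits as the next history value.

```python
import hashlib, os

def next_temporary_iid(history, stable_iid):
    # MD5 over the history value concatenated with the stable interface identifier.
    digest = hashlib.md5(history + stable_iid).digest()
    iid = bytearray(digest[:8])    # leftmost 64 bits become the temporary identifier
    iid[0] &= ~0x02 & 0xFF         # clear the universal/local bit
    return bytes(iid), digest[8:]  # rightmost 64 bits seed the next iteration

history = os.urandom(8)            # bootstrap with random data when no stored value exists
iid, history = next_temporary_iid(history, bytes.fromhex("021a2bfffe3c4d5e"))
print(iid.hex())
```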

Generation of Randomized Interface Identifiers:
Two approaches are proposed to generate randomized interface identifiers (the sketch above corresponds to the first one):
  • The first requires 64 bits of stable storage for the generated temporary addresses, so that each new identifier is derived from the previous one. This history value prevents two nodes from generating the same sequence of random numbers.
  • Without stable storage, configuration parameters such as the user ID or serial numbers are combined with random data and run through MD5 in order to generate the random numbers.
  • Alternative approaches exist, such as CGA (Cryptographically Generated Addresses), which generate a random interface ID based on the node's public key. The purpose is to prove ownership of an IPv6 address and to prevent address stealing and spoofing. However, this technique requires the node to hold a public key, and the node can still be identified by that key (transactions, etc.). The process is computationally intensive, which discourages frequent regeneration, especially on low-cost machines.
In all cases, every time a node generates a value, it must check that the address is not reserved for a particular usage or already assigned; if it is, the node repeats the process of generating a random interface identifier. The node also generates a temporary address for every public address it has created, and new temporary addresses are generated to replace expired ones.

The use of temporary addresses is an approach proposed to resolve privacy issues; however, this solution has the following impacts on the Internet:
  • The widespread use of temporary addresses complicates the generation of globally unique addresses from interface identifiers, since DAD should be applied to each generated address.
  • Clients whose addresses change over time make packet tracking, and hence debugging of unexpected behaviour, more difficult: one cannot tell whether packets come from one machine or from several.
  • Some servers refuse access to clients for which no DNS name exists, and temporary addresses are not registered in the DNS.
  • In a large network with a high rate of changing temporary addresses, how can newly generated addresses be distinguished from spoofed ones?

We can see clearly that stateless autoconfiguration generates addresses without the need for a central node and with no need to apply DAD. However, the unchanging interface identifier causes privacy issues and permits identification and node tracking across the Internet, since only the prefix changes when a node enters another network.
This prefix modification is a result of the hierarchical nature of Internet addresses, which facilitates routing and delegates address management between organisations. An unstructured architecture would hide the network topology but would put a heavier burden on routers to forward and deliver packets.
A new naming and addressing approach should strike a good balance between simplicity and flexibility in generating globally unique addresses in a distributed, autonomous manner, while providing privacy, i.e. not exposing a node's identity and location.

Link to the RFC

Tuesday, May 26, 2009

Review: RFC 4862 - IPv6 Stateless Address Autoconfiguration

RFC 4862 specifies the steps a host (but not a router) follows to autoconfigure its interfaces and generate IPv6 addresses.

Why stateless autoconfiguration?
  • Assign a unique address to an interface.
  • Hosts in small networks should not require a DHCP server or a router to obtain a unique address; they should be able to generate unique addresses within the network on their own.
  • Large networks should not require DHCP servers for address autoconfiguration; hosts should be able to generate globally unique addresses.
  • Facilitate address renumbering of a site or subnet. A site renumbers its nodes when it switches to a new network service provider (IP prefix change). During renumbering, the old IP address coexists with the new one for a period of time, until the old one becomes invalid.
How does stateless autoconfiguration work?
The process can be divided into the following phases:
  1. A node generates a link-local address when its interface becomes enabled (reboot, start-up, attachment to another link, etc.). This link-local address is formed by concatenating the interface identifier (generated from the interface's MAC address, for example) with the well-known link-local prefix FE80::.
  2. The node checks that the address is unique using Duplicate Address Detection (DAD). If the address is unique, it is assigned to the interface; if not, another interface identifier is required to generate a unique link-local address. Administrators can supply an alternate interface identifier; otherwise manual configuration is required. At the end of this phase, the node has a unique link-local address.
  3. The node listens to router advertisements carrying the information needed to generate global addresses. Solicitation messages can also be sent to routers to avoid waiting for advertisements. If specific flags are set, the node can take a prefix carried in the advertisement and combine it with the generated identifier (prefix concatenated with the interface ID); this prefix is usually the subnet's prefix.
  4. Duplicate Address Detection is required before assigning the global unique address to the interface. Some implementations apply DAD only to link-local addresses and assume the address is globally unique if it passes the local test. However, new techniques have been developed for privacy protection: an interface identifier can, for example, be generated randomly, and then a clash with another global address might occur [RFC 4941 - Privacy Extensions for Stateless Address Autoconfiguration].

Nodes keep listening to router advertisements, mainly to reset or increase the prefix's lifetime (valid time). When the advertised prefix differs from the one already in use, it is a renumbering case: the node forms a new address (new prefix + interface ID) and adds it to the list of addresses assigned to the interface.


According to this RFC, Duplicate Address Detection is not fully reliable and generates a large overhead when testing a globally unique IP address; other mechanisms to detect address collisions should be pushed further.
Site renumbering is another argument to add to the list of reasons for separating a host's name from its identity.
Generating the same suffix every time can expose the identity and location of a node, which in some cases is unwanted.

Link to RFC 4862

Monday, May 25, 2009

Review: 6LowPAN

Why can't IPv6 be applied directly to LoWPANs?
6LoWPAN defines the IPv6 protocol over Low-power Wireless Personal Area Networks. LoWPAN devices use IEEE 802.15.4 radios. To understand why IPv6 cannot be used directly on top of LoWPAN devices, let's list the characteristics of such devices [2]:
  • Small packet size: 81 bytes for data packets at the networking layer. (The maximum physical-layer packet is 127 bytes, so the maximum frame size at the media access control layer is 102 octets; up to 21 bytes are used for securing link-layer communications.)
  • Support for both 16-bit short and IEEE 64-bit MAC addresses.
  • Low-cost devices: low bandwidth, with data rates of 250 kbps, 40 kbps and 20 kbps at 2.4 GHz, 915 MHz and 868 MHz respectively; low power, typically battery dependent; low processing and storage capabilities (8 KB of RAM, limited buffering, etc.).
  • Topologies include star and mesh operation.
  • A large number of devices is expected to be deployed.
The main problems with IP for LoWPANs are the following:
The Maximum Transmission Unit (MTU) is the size (in bytes) of the largest PDU that a given layer of a communications protocol can pass to other layers. A higher MTU brings greater efficiency, because each packet carries more user data relative to protocol overhead; on the other hand, large packets can occupy a slow link for some time, causing greater delays for the following packets and increasing lag and minimum latency.
IPv6 uses 128-bit addresses and a 40-byte header, and requires an MTU of at least 1280 bytes in order to keep the transmission efficient (useful data versus overhead).
LoWPAN devices support small packets, with only 81 bytes available for the networking layer and above. Using IPv6 over IEEE 802.15.4 leaves only 41 bytes for the transport and application layers, which is obviously not enough for data exchange; packet fragmentation and reassembly is needed but consumes even more bytes. IP header compression is therefore required.
Since a large number of devices will be deployed, address autoconfiguration is attractive because it reduces the overhead between devices; a method is needed to generate and assign an identifier to a LoWPAN device from the EUI-64.
Routing protocols in mesh and star networks must be adapted to a small overhead: LoWPAN devices have limited resources (memory, bandwidth, CPU, energy), and processing 128-bit addresses and large headers decreases the system's efficiency and increases data processing latency.

Why use an IP-based protocol in LoWPANs?
The benefits of using IP-based networks are the following [1][2]:
  • The hierarchy of naming and addressing in IP networks simplifies the connectivity model.
  • IP-based technologies already exist, are well known, and are proven to work.
  • IP networking technology is specified in open and freely available specifications.
  • Existing tools for diagnostics, management and debugging of IP networks can be reused instead of designing and developing new ones.
  • IP-based devices can easily be connected to other IP-based networks, without intermediate entities such as translation gateways or proxies.
Addressing Modes
IEEE 802.15.4 uses IEEE 64-bit addresses and 16-bit short addresses. Short addresses are assigned by a PAN coordinator during an association event, which means that the validity and uniqueness of such addresses are limited by the lifetime of the association, failure of the coordinator, etc. [3].
For a 16-bit short address, a pseudo 48-bit address is formed by concatenating 16 zero bits with the 16-bit PAN ID (Personal Area Network identifier); if no PAN ID is assigned, 16 zero bits are used instead. The resulting 32 bits are concatenated with the short address to obtain a 48-bit address.
From the 48-bit address, a 64-bit interface identifier is formed as in [5] and [6] by inserting 0xFFFE in the middle of the 48 bits (24 bits, 0xFFFE, 24 bits). IPv6 link-local addresses are formed from the 64-bit interface identifier by prepending the prefix FE80::/64 (FE80::EUI-64); global addresses are formed from information advertised by routers [7].
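A small sketch of that derivation for a 16-bit short address (the PAN ID and short address values are examples, and the final formatting is only a rough link-local rendering):

```python
def short_address_to_iid(pan_id, short_addr):
    # Pseudo 48-bit address: 16-bit PAN ID, 16 zero bits, 16-bit short address.
    addr48 = pan_id.to_bytes(2, "big") + b"\x00\x00" + short_addr.to_bytes(2, "big")
    # Insert 0xFFFE between the two 24-bit halves to obtain the 64-bit interface identifier.
    return addr48[:3] + b"\xff\xfe" + addr48[3:]

iid = short_address_to_iid(pan_id=0xABCD, short_addr=0x0001)
print(iid.hex())                                        # abcd00fffe000001
groups = [iid.hex()[i:i + 4] for i in range(0, 16, 4)]
print("fe80::" + ":".join(groups))                      # fe80::abcd:00ff:fe00:0001
```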
Routers form the link between IP networks and LoWPAN networks; they handle the address translation.

Header types
6LoWPAN proposes specific header encodings and compression mechanisms to fit IPv6 into IEEE 802.15.4 frames. Separating the headers reduces overhead: a device sending short packets directly to another node does not pay for extra fields such as mesh addressing or fragmentation. This overhead reduction is an energy saving.
The header types are the following:
  • The Dispatch header (1 byte) defines the type of header that follows. It is identified by its first two bits, set to either 00 (non-6LoWPAN frames) or 01; the remaining 6 bits indicate whether the following field is an uncompressed IPv6 header or an HC1 header (compressed IPv6 header). To accomplish compression [1], the protocol exploits a combination of facts: the low-order 64 bits of an IPv6 address (the link-local address) can be the device's MAC address, the 802.15.4 frame already carries these MAC addresses, and a number of fields in the IPv6 header are static. Combining these allows the protocol to compress the standard 40-byte IPv6 header down to just 2 bytes (including the HC1 header byte) for most intra-PAN unicast communication, where the source and destination addresses are elided and regenerated from the link-level (IEEE 802.15.4) frames. All the remaining fields can be reconstituted without any state information at the receiving or intermediate nodes. Additionally, by deriving the link-local address from the device's MAC address, 6LoWPAN can use stateless address autoconfiguration (zero-conf) and eliminates the need for infrastructure servers such as DHCP servers.
  • The Mesh Header (4 bytes) is used to encode the hop limit and the source and destination of the packet. It includes two single-bit fields indicating whether the originating and final addresses are short or long addresses. The "hops left" field is a 4-bit field used to limit the number of intermediate hops between the source and destination; the value 0xF is reserved to indicate that an extra byte is included, allowing network depths of up to 255 hops.
  • The Fragmentation Header (4 bytes for the first fragment and 5 bytes for subsequent fragments) supports the fragmentation and reassembly of frames larger than the 802.15.4 frame size.
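To make the header typing concrete, here is a minimal, simplified sketch of how a receiver might classify the first byte of a frame. The two dispatch values (0x41 for uncompressed IPv6, 0x42 for HC1) follow RFC 4944; everything else is an illustrative simplification, not an implementation:

```python
def classify_first_byte(b: int) -> str:
    """Very rough classification of a 6LoWPAN frame's first byte."""
    top2 = (b >> 6) & 0b11
    if top2 == 0b00:
        return "not a 6LoWPAN frame"
    if top2 == 0b01:
        if b == 0x41:
            return "dispatch: uncompressed IPv6 header follows"
        if b == 0x42:
            return "dispatch: HC1 compressed IPv6 header follows"
        return "dispatch: other"
    if top2 == 0b10:
        return "mesh header"
    return "fragmentation header"

for byte in (0x41, 0x42, 0x80, 0xC0):
    print(hex(byte), "->", classify_first_byte(byte))
```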



Today there are at least six implementations of 6LoWPAN on multiple 802.15.4 radio platforms. The working group is still investigating neighbor discovery (IPv6 network prefixes, local routers and other network configuration parameters) and service discovery, to locate other sensors, controllers and higher-layer services.
Is it a good choice to push IP-based protocols further into areas such as LowPANs while efforts to redesign the Internet are increasing?
Should we inherit IP's limitation of merging naming and addressing into LowPANs?
What about cross-layer violations when compressing IPv6 headers and regenerating source and destination addresses from link-level frames? According to the OSI model, layers should be independent and unable to interpret other layers' data.
Even though header compression is efficient in some cases (at the cost of layer violation), other parts of the LowPAN stack remain unoptimized. Additional work should address routing protocols and reduce their overhead.

References:
1. Geoff Mulligan and the 6LoWPAN Working Group, "The 6LoWPAN Architecture", EmNets '07: Proceedings of the 4th Workshop on Embedded Networked Sensors.
2. RFC 4919: IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and Goals.
3. RFC 4944: Transmission of IPv6 Packets over IEEE 802.15.4 Networks.
4. "6LoWPAN: Incorporating IEEE 802.15.4 into the IP architecture", Internet Protocol for Smart Objects (IPSO) Alliance.
5. RFC 2464: Transmission of IPv6 Packets over Ethernet Networks.
6. http://technet.microsoft.com/en-us/library/cc736439(WS.10).aspx
7. RFC 4862: IPv6 Stateless Address Autoconfiguration.

Thursday, May 7, 2009

The Internet sky really is falling

This is a really interesting short review of the Internet's limitations.

Link to the page

Tuesday, April 21, 2009

Review: SENS - a Scalable and Expressive Naming System using CAN Routing Algorithm

This paper proposes a Scalable and Expressive Naming System (SENS) based on the CAN routing algorithm. SENS is a descriptive naming scheme that uses a tuple of attribute/value pairs to name each resource; a resource such as a computer is named (string OS = "Linux", string CPU-name = "Pentium 4", etc.). This information is stored on a large number of name servers (NSs).

Their design claims to achieve scalable and efficient resource-information distribution and retrieval. SENS handles exact queries (MEMORY = 512 MB) and multi-attribute range queries (MEMORY > 512 MB) with small overhead, as well as load balancing.

DESIGN OF SENS:
Mapping resource names to resource IDs:
A resource ID is considered as a set of d coordinates of a point in the d-dimensional resource ID space (in the example below, d = 6). A resource name is mapped to a resource ID by assigning the hash value of each attribute/value (a/v) pair of the resource name to a coordinate value of the resource ID. Name servers are responsible for sets of resource IDs, just as in the CAN system.
Ha is a function that uniformly hashes every attribute to a dimension between 1 and d, and Hv hashes every attribute value to the interval [1, 2^(m-1)], where m is the maximum size of a coordinate value in bits.
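To make the mapping concrete, here is an illustrative sketch (not the paper's actual Ha/Hv functions) of hashing attribute/value pairs into a d-dimensional resource ID:

```python
import hashlib

D = 6   # dimensions of the resource ID space (d in the paper)
M = 16  # assumed coordinate size in bits (m in the paper)

def _h(s: str) -> int:
    return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

def Ha(attribute: str) -> int:
    """Stand-in for Ha: hash an attribute name to a dimension in [1, D]."""
    return (_h(attribute) % D) + 1

def Hv(value: str) -> int:
    """Stand-in for Hv: hash an attribute value to a non-zero coordinate."""
    return (_h(value) % (2 ** M - 1)) + 1

def resource_id(name: dict) -> tuple:
    """Map a resource name (attribute/value pairs) to a resource ID.
    Unused dimensions stay at zero; the case where two attributes hash to the
    same dimension (yielding multiple resource IDs) is omitted for brevity."""
    coords = [0] * D
    for attribute, value in name.items():
        coords[Ha(attribute) - 1] = Hv(str(value))
    return tuple(coords)

print(resource_id({"OS": "Linux", "CPU-name": "Pentium 4", "MEMORY": "512"}))
```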

If multiple attributes in a resource name are hashed to the same value, i.e. Ha(attr_i) = Ha(attr_j), then the corresponding attribute values are mapped to multiple coordinate values in the same dimension. This means that a resource name is mapped to multiple resource IDs, which are distributed over the NSs.


Their mapping scheme is not injective: several resource names can be mapped to the same resource ID. Consequently, a resource ID is not a unique identifier of a resource name; both the resource ID (resulting from the hash) and the resource name are required to uniquely identify a resource.

For numerical attribute values, a locality-preserving hashing function is used, defined such that if val1 > val2 then H(val1) > H(val2). The main purpose of using a locality-preserving hash is to handle range queries: it ensures that the resource IDs of a value range fall in an interval of resource IDs between a minimum and a maximum value, which limits the number of NSs responsible for a query range.
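One simple way to obtain such an order-preserving behaviour, assuming the numeric attribute has a known bounded range, is linear scaling onto the coordinate interval. The bounds below are hypothetical, not the paper's:

```python
M = 16               # coordinate size in bits
LO, HI = 0, 65536    # assumed bounds of the numeric attribute (e.g. MB of RAM)

def locality_preserving_hash(value: float) -> int:
    """Order-preserving up to rounding: larger values never hash lower."""
    value = max(LO, min(HI, value))                    # clamp to the known range
    return 1 + int((value - LO) / (HI - LO) * (2 ** M - 2))

assert locality_preserving_hash(1024) > locality_preserving_hash(512)
```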

If the number of attribute/value pairs is lower than d, the resource ID is padded with zeros.
However, when the number of attribute/value pairs is higher than d, the set of attributes has to be divided into multiple sets of attributes, which correspond to multiple resource names (this fragmentation aspect is not treated in the paper).

Resource Information Distribution:
Since their scheme is CAN-based, zones are assigned to NSs; consequently, each NS manages resource information according to the resource ID.
A resource name may map to multiple resource IDs. If one NS is responsible for several resource IDs of the same resource, only one copy is maintained at that NS. When the resource IDs belong to different zones, i.e. different NSs, they use a multicast routing algorithm based on Spanning Binomial Trees (SBT). Their algorithm sends the minimum number of messages needed to deliver the information to the NSs.
The resource IDs corresponding to a resource name form a hypercube in the resource ID space, following the article "Optimum Broadcasting and Personalized Communication in the Hypercube". (They don't explain how the hypercube is built, so further reviews will detail the construction algorithm.)


The registration message containing the resource's information is first delivered to the NS responsible for the resource ID formed from the lowest value of each resource ID coordinate (in this example the root node is (0,0,0)). This NS becomes the root node and forwards the message to its descendants according to the tree; those descendants in turn forward the registration message to their own descendants, and so on.

Query resolution:
SENS supports:
  • Exact queries: a query host sends a message to a NS, which maps the query resource name to resource IDs and selects the nearest destination resource ID. The message reaches its destination using the CAN routing algorithm. The NS responsible for that resource ID looks up its database to find the queried information and sends it back to the initial NS (the one first queried by the host).
  • Range queries: a range query is bounded by the hash values of the upper and lower limits of the queried value ranges in each dimension. When a host sends a range query message to a NS, the latter maps the query range to a range-query segment in the resource ID space. The query message is broadcast to all NSs whose zones overlap the segment. They propose a broadcasting algorithm based on the SBT and hypercube in order to reduce the number of broadcast messages. (Further details will be added once I read the article that treats hypercube and SBT formation; a rough sketch of how the segment can be bounded follows this list.)
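As a rough, self-contained sketch (reusing the same illustrative hash choices as in the earlier sketches, not the paper's functions), a range query on one numeric attribute could be bounded like this:

```python
import hashlib

D, M = 6, 16          # dimensions and coordinate size, as in the earlier sketch
LO, HI = 0, 65536     # assumed bounds of the numeric attribute

def Ha(attribute: str) -> int:
    h = int.from_bytes(hashlib.sha1(attribute.encode()).digest()[:8], "big")
    return (h % D) + 1

def Hv_num(value: float) -> int:
    value = max(LO, min(HI, value))
    return 1 + int((value - LO) / (HI - LO) * (2 ** M - 2))

def range_query_segment(attribute: str, low: float, high: float):
    """Dimension and coordinate interval covered by a range query; only NSs
    whose zones overlap this segment need to be contacted (routing omitted)."""
    return Ha(attribute), (Hv_num(low), Hv_num(high))

print(range_query_segment("MEMORY", 512, 2048))
```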
Related works:
This article proposes a more expressive naming scheme than DNS, which offers a value-limited resource name space without the possibility of range queries.
Other systems, such as the Intentional Naming System, use a descriptive name space also based on attribute/value pairs; message routing for a name query is realized by looking up the query name in forwarding tables. The main limitation of such systems is scalability, since the forwarding tables grow with the number of resource names.
DHT-based routing protocols like Chord and CAN achieve scalable and efficient lookup by hashing a resource name to a DHT key. Range queries are the limitation of such systems: a query may spread over the whole DHT key space.
Other systems, such as MAAN, support range queries: nodes are responsible for the attribute values of an attribute in the query range. The node responsible for the attribute/value pair (string OS = "Linux") must keep all information related to the Linux OS. The main limitation is load balancing, since popular attribute/value pairs may appear in resource names with high probability.

SENS is a naming system capable of handling resource information with exact and multi-attribute range queries. The authors propose a multicast/broadcast algorithm to deliver and retrieve information.
However, some issues remain unclear:
Several resource names might be mapped to the same resource ID, and unique identification is done by the pair (resource ID, resource name). Is that enough to uniquely identify a resource? Is it possible for two different resource names sharing the same attribute values to be mapped to the same resource ID? In 2005, Xiaoyun Wang and Hongbo Yu produced deliberate collisions in the MD5 hashing algorithm (two different values hashed to the same digest).
Is SENS trying to merge search engines and DNS-like systems?
Do we really need to merge such systems? Is it faster? More reliable? More scalable?
Is it a good approach to map a resource to many resource IDs? Does having multiple IDs accelerate routing and finding the nearest ID? What if we mapped essential, important data to multiple IDs and restricted non-important data to a single resource ID? How would we design such a ranking system to classify data according to its importance?
What if I upgraded my RAM from 512 MB to 1024 MB; how does SENS manage such updates?
Is the space partitioning efficient? When a node joins the system, a lot of information is handed over to it, so huge messages are transmitted to the new node.

Link to the article

Friday, April 17, 2009

Review: The SATIN Component System—A Metamodel for Engineering Adaptable Mobile Systems

Readers might be wondering why I am writing about components now; well, I am interested in meta-models. Such models can be useful when designing a naming and addressing system: the metadata associated with each component helps to find a specific component.

This article proposes a component model to deal with mobile adaptation.
Applications are most often monolithic (one code block), which makes them hard to update and maintain when an event or a change of context occurs.
Their model allows logical mobility, defined as the migration of a partial or complete application or process from one host to another.

Component-based development divides the system into interacting components with well-defined interfaces. Traditional approaches do not support code migration and execution at runtime; SATIN applies this mobility to components.

The SATIN component model allows components to be sent and received dynamically at runtime. Instead of relying on the invocation of remote services, SATIN components are remotely cloned and instantiated locally.
This solution provides autonomy when the network is disconnected. Applications can introspect which components are available locally for a task and dynamically change the system configuration by adding or removing components.

Component metadata is essential to describe a component's properties, which are the set of its attributes (ID, version, dependencies on other components, hardware, etc.). The SATIN component system uses attributes similar to those of the Debian packaging system. This allows applications running on heterogeneous hardware/frameworks to decide whether or not to deploy a component.
The central component of a SATIN system is the container component, which acts as a registry and holds a reference to each component installed on the system. A registrar component is used for loading and removing components, managing dependencies, etc.

They consider four aspects of logical mobility: components, classes, instances and data types (bit streams). A Logical Mobility Entity (LME) is an abstract generalization of a class, an instance or a data type. A Logical Mobility Unit (LMU) is an encapsulation of LMEs; it provides operations that permit inspection of its content and a handler class to deploy and manipulate its contents. The properties (set of attributes) of an LMU are used to express dependencies, the target environment, etc. An LMU and its contents are serialized and deserialized for transfer and deployment between physical hosts. An LMU is always deployed into a reflective component, which can adapt at runtime by receiving LMUs from other hosts.
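As I read it, the LME/LMU abstraction could be sketched roughly as follows; the class and attribute names are mine, not SATIN's actual API, and pickle merely stands in for whatever serialization SATIN uses:

```python
import pickle
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class LogicalMobilityEntity:
    """Generalization of a class, an instance or a data type to be shipped."""
    kind: str      # "class" | "instance" | "data"
    payload: Any

@dataclass
class LogicalMobilityUnit:
    """Encapsulates LMEs plus properties (dependencies, target environment...)."""
    properties: Dict[str, str] = field(default_factory=dict)
    entities: List[LogicalMobilityEntity] = field(default_factory=list)

    def serialize(self) -> bytes:
        return pickle.dumps(self)

    @staticmethod
    def deserialize(blob: bytes) -> "LogicalMobilityUnit":
        return pickle.loads(blob)

lmu = LogicalMobilityUnit(properties={"version": "1.0", "target": "midp-2.0"},
                          entities=[LogicalMobilityEntity("data", b"\x00\x01")])
received = LogicalMobilityUnit.deserialize(lmu.serialize())
print(received.properties)
```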
When a change in the context/environment occurs (user- or component-triggered), the system asks the deployer for a component satisfying a set of attributes. The local deployer contacts a remote deployer and asks for the component; the remote container packs an LMU and sends it back to the host.
Component discovery and advertising is kept generic; it can use Jini, UPnP, etc.
In their implementation they used an IP multicast advertising group and a simple centralized publish/subscribe protocol.

So far my knowledge of mobile adaptive component systems is limited; comments on this topic will be added later.
However, classic issues deserve to be explored, such as:
How much time is needed to transfer a component in a mobile system?
Can we guarantee such a time in a dynamic and mobile system, and how? Mobile nodes appear and disappear, and the transfer of an LMU may be interrupted by a node's departure (issues similar to MANETs).
How to advertise services and new nodes with minimal overhead and message exchange between nodes?
How to manage the resources allocated to every component on resource-limited platforms?

Link to the article

Wednesday, April 15, 2009

Review: Toward a search architecture for software components

This paper proposes a design of a component search engine for Grid applications.
With the development of the component-based programming model, applications will increasingly be formed dynamically by associating components. Developers should be able to reuse already developed components that match their needs; to do so, a component search engine seems essential.
The component search for Grid applications offers two facilities:
  1. Developers will be able to find the best component for their needs.
  2. The framework can replace a malfunctioning or slow component dynamically (at run time). The application should be able to decide which component best replaces the malfunctioning one.
They assume that open-source Grid applications will appear and that software components will be found on portals. These components will be ranked according to their usage: the more a component is used by applications, the more important it is considered. This ranking establishes a trust index, an approach similar to the one Google uses to rank pages and improve search results. (A toy illustration follows.)
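As a toy illustration of this usage-based trust index (a much simpler signal than Google's link-based PageRank), component ranks could be derived from a usage log like this; all names are invented:

```python
from collections import Counter

# Hypothetical usage log: (component, application) pairs observed on portals.
usage_log = [
    ("matrix-solver", "app-A"), ("matrix-solver", "app-B"),
    ("fft-lib", "app-A"), ("matrix-solver", "app-C"),
]

def rank_components(log):
    """The more applications use a component, the higher its trust index."""
    return Counter(component for component, _app in log).most_common()

print(rank_components(usage_log))  # [('matrix-solver', 3), ('fft-lib', 1)]
```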

One of the related works:
The Agora component search engine supports the location and indexing of components as well as the search and retrieval of a component. Agora automatically discovers sites containing software components by crawling the web (using Google's web crawler); when it finds a page containing an Applet tag, it downloads and indexes the related component. Agora supports JavaBeans and CORBA components. The database search is keyword-based and refined by users.

Workflows:
A workflow can be described as a process description of how tasks are done, by whom, in what order and how quickly.
Workflows are represented with low-level languages such as BPEL4WS, which requires a great deal of user effort to describe even a simple workflow.
Higher-level languages and graphical user interfaces on top of BPEL4WS are being introduced to generate the BPEL code.

Their approach is workflow-based: components can be adapted and coordinated through workflows. Applications should be able to choose and bind to components from different sources on the Grid. Such an application first searches its own local repository for components previously used or installed, and then uses a search engine to find suitable components.

The application development process can be divided into 3 stages:
  1. Application sketching: developers specify (1) an abstract workflow plan describing how information passes through the application's parts, and (2) place-holders describing the functions and operations to be carried out. This description helps to find a list of suitable components.
  2. Component discovery is based on two steps. First, the place-holder query is resolved by searching the local repository; if a suitable component is found locally, an identifier of this component is returned to the application. Second, if no component was found, a query session is started on remote sites; a list of ranked components is returned and refined by user specifications.
  3. Application assembling is the binding phase. Data or protocol conversions are often needed because of the heterogeneous inputs/outputs of components (string to array-of-double conversion, etc.).
GRIDLE is their component search engine: a Google-like Ranking, Indexing and Discovery service for a Link-based Eco-system of software components. Its main modules are the following:
  1. The Component Crawler works like a web crawler: it retrieves new components, updates links (bindings) between components, and passes the results to the indexer.
  2. The Indexer builds GRIDLE's index data structure. The characteristics and metadata associated with a component must be carefully selected for indexing, since this meta-information is what allows the suitable component to be retrieved. Such metadata can be: (1) functional information such as interfaces (published methods, names, signatures) and the runtime environment; (2) non-functional information such as QoS and a textual description; (3) linking information to other components. (A sketch of such an index entry follows this list.)
  3. The Query Analyzer resolves queries against the index; it uses a ranking module to retrieve the most relevant components, and the search is then refined by the user.
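Purely as an illustration of what an indexed record and a naive keyword lookup might look like (the field names are my own reading of the metadata categories above, not GRIDLE's schema):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ComponentRecord:
    component_id: str
    interfaces: List[str]     # functional information (published methods)
    runtime: str              # functional information (runtime environment)
    description: str          # non-functional, textual description
    linked_to: List[str] = field(default_factory=list)  # bindings to other components

index = [
    ComponentRecord("fft-lib", ["fft", "ifft"], "jvm-1.5", "fast Fourier transform"),
    ComponentRecord("matrix-solver", ["solve"], "jvm-1.5", "dense linear solver", ["fft-lib"]),
]

def keyword_search(query: str):
    """Naive keyword match over interfaces and descriptions."""
    q = query.lower()
    return [r.component_id for r in index
            if q in r.description.lower() or any(q in m.lower() for m in r.interfaces)]

print(keyword_search("solver"))  # ['matrix-solver']
```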

At this stage I don't have advanced knowledge of such systems and search engines, but I find this approach interesting, since the world of component development is emerging.
In the near future, thousands of components will be developed and ready to use. One of the main reasons for the wide adoption of the component-based programming model is the ability to reuse already developed components and save time during development. A search engine seems necessary in order to find and locate suitable components.
Some issues in their approach remain unexplained or unclear, such as:
  • Components will be updated, deleted and added, so how should the crawler's iteration frequency be determined in order to keep the index up to date?
  • The same question arises for component bindings. Since the model is inspired by web pages, I think components are more dynamic where binding is concerned: bindings will appear and disappear dynamically (at runtime) when a component is replaced. How is a component's ranking maintained, and how often is the ranking algorithm run?
  • In their approach, the local repository is searched first for a suitable component. What if remote sites hold better-suited components with higher ranks than those already in the local repository? What policy should be used to keep the local repository up to date?
  • The crawling module searches for new components; do we need to insert an agent on every repository?
  • How are heterogeneity issues between components managed, for example COM versus CORBA components?
  • Semantic web and ontology techniques might simplify the mapping and querying, even though the GRIDLE designers consider them a disadvantage because they would impose a single unified taxonomy.

Link to the article
PS: According to the ranking algorithm, the rank of the page hosting the article increases while the rank of my blog decreases; I am actually giving it a portion of my page's rank.

Review: An IPv6-Based Identification Scheme

This article presents an IPv6-based identification scheme to identify physical objects with RFID tags. Such identification is needed in different fields, for example locating objects, health-care monitoring and military operations.

Their scheme is based on the IPv6 unicast address format:

010.Registry_ID[5].Provider_ID[16].0[8].Subscriber_ID[24].0[8].Subnet_ID[16].Interface[48]

(note that "." is used as a concatenation operator and X[n] means that n bits are used to encode field X).

Registry_ID is allocated to organizations responsible for assigning network addresses.
Provider_ID is allocated to an Internet service provider; 0[8] is reserved for future extension.
To identify objects, they propose to use the unassigned IPv6 namespace with the binary prefix "001", in two formats:

General ID:
0010.Agency_ID[5].Domain_name[48].0[7].Object_Class[16].Serial_Number[48].

Agency_ID is analogous to the Registry_ID; the agency is responsible for allocating the identifier. Domain_name identifies the company or organization. 0[7] is reserved for future use. Object_Class identifies object types, and Serial_Number is the ID of an object within its type.
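To make the bit layout concrete, here is a small sketch that packs the General ID fields above into a 128-bit value; the field values are invented:

```python
def pack(fields):
    """fields: list of (value, width_in_bits); concatenated MSB first."""
    out = 0
    for value, width in fields:
        assert value < (1 << width), "field value does not fit its width"
        out = (out << width) | value
    return out

general_id = pack([
    (0b0010, 4),              # format prefix
    (0x0A, 5),                # Agency_ID
    (0x1234567890AB, 48),     # Domain_name (encoded; hypothetical value)
    (0, 7),                   # reserved for future use
    (0x0042, 16),             # Object_Class
    (0x0000DEADBEEF, 48),     # Serial_Number
])
print(f"{general_id:032x}")   # the 128-bit identifier as 32 hex digits
```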

Pseudo Random ID:
0011.Agency_ID[5].Random_Number[119]

This scheme provides more privacy: it does not reveal the ID of the company or other information. They clearly distinguish between an IPv6 address used to locate an object and an IPv6 ID used to identify an object. Prefix translation is what they propose for converting between an IPv6 ID (prefix 001) and an IPv6 address (prefix 010), which means that the IPv6 ID may be used to obtain an IPv6 address. Since physical objects are mobile, they propose the following two methods to track objects:

Name System (same ID, multiple addresses):
They propose to use canonical names written in the reverse of the order in which they are constructed, so that they can be used like DNS names. By doing this, they can integrate their scheme into existing systems such as DNS.

Serial_number.object-class.company.organization.obj
A DNS query starts at the "obj" level and then proceeds from right to left.
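A tiny sketch of building such a reversed canonical name (the labels are examples, not a real allocation):

```python
def canonical_name(organization, company, object_class, serial_number):
    # Construction order: organization -> company -> object class -> serial number,
    # written in reverse so the most general label ("obj") is the rightmost suffix.
    labels = [serial_number, object_class, company, organization, "obj"]
    return ".".join(labels)

print(canonical_name("gs1", "acme", "razor-m3", "0042"))
# -> 0042.razor-m3.acme.gs1.obj
```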

When an object moves, we only need to update the DNS record that maps its name to an address. When an object moves to a different domain, the new owner of the object should update the DNS record. Since DNS is not well suited to frequent updates, a localization service provided by the proxy handles the update.

Address forward scheme:
They use the home-agent approach (see the Mobile IP paragraph).
They assume that routers are configured to distinguish identifiers from addresses.
We can search for an object by its ID because routers translate the IPv6 ID into an IPv6 address by modifying the 3-bit prefix; the ID, and therefore the address, remains the same. Objects of the same owner are assigned to a dedicated proxy whose address has the same domain prefix. When a router receives an object ID, it translates it and forwards it to the corresponding proxy according to the domain name. When an object moves, its ID, and therefore its address, remains the same; the object updates its home proxy with its new location and informs the new proxy of its ID. Routers forward packets to the home proxy according to the domain name; this proxy plays the same role as a home agent in Mobile IP and forwards requests to the new proxy where the object has moved. (A minimal sketch of the prefix translation follows.)
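A minimal sketch of the prefix-translation idea, flipping the leading three bits from the ID prefix (001) to the unicast address prefix (010); purely illustrative:

```python
def id_to_address(ipv6_id: int) -> int:
    """Translate a 128-bit IPv6 ID (prefix 001) into an address (prefix 010),
    leaving the remaining 125 bits untouched."""
    assert (ipv6_id >> 125) == 0b001, "not an IPv6 ID (prefix 001)"
    return (0b010 << 125) | (ipv6_id & ((1 << 125) - 1))

ipv6_id = (0b001 << 125) | 0xDEADBEEF   # hypothetical identifier
addr = id_to_address(ipv6_id)
print(f"{addr:032x}")                    # now carries the 010 prefix
```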

The approach is quite comprehensive, especially the mapping between ID and address via the 3-bit prefix and the ease of integration into today's systems without major modifications and without the need to query directories like DNS.
However, objects are usually manufactured by the thousands and millions (Gillette razors); when a container moves from one owner to another, the resulting updates will generate a massive traffic overhead between owners.
We face the same problems as with Mobile IP: the object's trace is lost during the transition from one proxy to another.
Not all companies have the same production capacity. Small companies manufacture a small number of items, which means they will not use up the IDs assigned to them, while big companies will exhaust their IDs in a short time (compared to small companies). Should small companies share a domain? That would mean that the proxies of small companies cannot use a routing mask when dealing with IDs and IP addresses.
If two companies share the same domain, they will have the same prefix, and since a mask cannot be applied, a proxy will have to list all the IDs of the objects it owns. Proxies will then be overloaded, and query I/O time will be slow due to the huge amount of data.
As with Mobile IP: what if the home proxy, or the company owning the domain, shuts down for economic or other reasons? How is address forwarding maintained?

Link to the article