Penser en Java - 9) Stockage des objets

9) Stockage des objets

Texte original

Traducteur : Jérome Quelin

///

Ce chapitre contient 14 pages
1 2 3 4 5 6 7 8 9 10 11 12 13 14

\\\

In main( ), each time a random number is generated it is wrapped inside an Integer object so that reference can be used with the HashMap. (You can’t use a primitive with a container, only an object reference.) The containsKey( ) method checks to see if this key is already in the container. (That is, has the number been found already?) If so, the get( ) method produces the associated value for the key, which in this case is a Counter object. The value i inside the counter is incremented to indicate that one more of this particular random number has been found.

Dans main(), chaque fois qu'un nombre aléatoire est généré, il est encapsulé dans un objet Integer dont la référence sera utilisée par le HashMap (on ne peut stocker de scalaire dans un conteneur, uniquement une référence sur un objet). La méthode containsKey() vérifie si la clef existe dans le conteneur (c'est à dire, est-ce que le nombre a déjà été généré auparavant ?). Si c'est le cas, la méthode get() renvoie la valeur associée à cette clef, un objet Counter dans notre cas. La valeur i à l'intérieur du compteur est incrémentée pour indiquer qu'une occurence de plus de ce nombre aléatoire particulier a été généré.

If the key has not been found yet, the method put( ) will place a new key-value pair into the HashMap. Since Counter automatically initializes its variable i to one when it’s created, it indicates the first occurrence of this particular random number.

Si la clef n'a pas déjà été stockée, la méthode put() insèrera une nouvelle paire clef - valeur dans le HashMap. Puisque Counter initialise automatiquement sa variable i à 1 lorsqu'elle est créée, cela indique une première occurence de ce nombre aléatoire particulier.

To display the HashMap, it is simply printed. The HashMap toString( ) method moves through all the key-value pairs and calls the toString( ) for each one. The Integer.toString( ) is predefined, and you can see the toString( ) for Counter. The output from one run (with some line breaks added) is:

Pour afficher le HashMap, il est simplement imprimé. La méthode toString() de HashMap parcourt toutes les paires clef - valeur et appelle toString() pour chacune d'entre elles. La méthode Integer.toString() est prédéfinie, et on peut voir la méthode toString() de la classe Counter. La sortie du programme (après l'insertion de quelques retours chariot) ressemble à :

{19=526, 18=533, 17=460, 16=513, 15=521, 14=495, 13=512, 12=483, 11=488, 10=487, 9=514, 8=523, 7=497, 6=487, 5=480, 4=489, 3=509, 2=503, 1=475, 0=505}

{19=526, 18=533, 17=460, 16=513, 15=521, 14=495,
13=512, 12=483, 11=488, 10=487, 9=514, 8=523,
7=497, 6=487, 5=480, 4=489, 3=509, 2=503, 1=475,
0=505}

You might wonder at the necessity of the class Counter, which seems like it doesn’t even have the functionality of the wrapper class Integer. Why not use int or Integer? Well, you can’t use an int because all of the containers can hold only Object references. After seeing containers the wrapper classes might begin to make a little more sense to you, since you can’t put any of the primitive types in containers. However, the only thing you can do with the Java wrappers is to initialize them to a particular value and read that value. That is, there’s no way to change a value once a wrapper object has been created. This makes the Integer wrapper immediately useless to solve our problem, so we’re forced to create a new class that does satisfy the need.

On peut s'interroger sur la nécessité d'une classe Counter, qui ne semble même pas avoir les fonctionnalités de la classe d'encapsulation Integer. Pourquoi ne pas utiliser un int ou un Integer ? On ne peut utiliser un int puisque les conteneurs ne peuvent stocker que des références d'Object. On pourrait alors être tenté de se tourner vers les classes Java d'encapsulation des types primitifs. Cependant, ces classes ne permettent que de stocker une valeur initiale et de lire cette valeur. C'est à dire qu'il n'existe aucun moyen de changer cette valeur une fois qu'un objet d'encapsulation a été créé. Cela rend la classe Integer inutile pour résoudre notre problème, et nous force à créer une nouvelle classe qui statisfait l'ensemble de nos besoins.

SortedMap

Maps triées : les SortedMaps

If you have a SortedMap (of which TreeMap is the only one available), the keys are guaranteed to be in sorted order which allows additional functionality to be provided with these methods in the SortedMap interface:

Une SortedMap (dont TreeMap est l'unique représentant) garantit que ses éléments seront stockés triés selon leur clef, ce qui permet de proposer de nouvelles fonctionnalités grâce aux méthodes supplémentaires de l'interface SortedMap suivantes :

Comparator comparator(): Produces the comparator used for this Map, or null for natural ordering.

Comparator comparator() : Renvoie le Comparator utilisé pour cette Map, ou null dans le cas d'un tri naturel.

Object firstKey(): Produces the lowest key.

Object firstKey() : Renvoie la plus petite clef.

Object lastKey(): Produces the highest key.

Object lastKey() : Renvoie la plus grande clef.

SortedMap subMap(fromKey, toKey): Produces a view of this Map with keys from fromKey, inclusive, to toKey, exclusive.

SortedMap subMap(fromKey, toKey) : Renvoie une vue de la Map contenant les paires dont les clefs vont de fromKey inclus à toKey exclu.

SortedMap headMap(toKey): Produces a view of this Map with keys less than toKey.

SortedMap headMap(toKey) : Renvoie une vue de la Map contenant les paires dont la clef est inférieure à toKey.

SortedMap tailMap(fromKey): Produces a view of this Map with keys greater than or equal to fromKey.

SortedMap tailMap(fromKey) : Renvoie une vue de la Map contenant les paires dont la clef est supérieure ou égale à fromKey.

Hashing and hash codes

Hachage et codes de hachage

In the previous example, a standard library class (Integer) was used as a key for the HashMap. It worked fine as a key, because it has all the necessary wiring to make it work correctly as a key. But a common pitfall occurs with HashMaps when you create your own classes to be used as keys. For example, consider a weather predicting system that matches Groundhog objects to Prediction objects. It seems fairly straightforward—you create the two classes, and use Groundhog as the key and Prediction as the value:

Dans l'exemple précédent, une classe de la bibliothèque standard (Integer) était utilisée comme clef pour le HashMap. Cela ne pose pas de problèmes car elle dispose de tout ce qu'il faut pour fonctionner correctement comme une clef. Mais il existe un point d'achoppement classique avec les HashMaps lorsqu'on crée une classe destinée à être utilisée comme clef. Considérons par exemple un système de prévision météorologique qui associe des objets Groundhog à des objets Prediction. Cela semble relativement simple : il suffit de créer deux classes, et d'utiliser Groundhog comme clef et Prediction comme valeur :

//: c09:SpringDetector.java // Looks plausible, but doesn't work. import java.util.*; class Groundhog { int ghNumber; Groundhog(int n) { ghNumber = n; } } class Prediction { boolean shadow = Math.random() > 0.5; public String toString() { if(shadow) return "Six more weeks of Winter!"; else return "Early Spring!"; } } public class SpringDetector { public static void main(String[] args) { HashMap hm = new HashMap(); for(int i = 0; i < 10; i++) hm.put(new Groundhog(i), new Prediction()); System.out.println("hm = " + hm + "\n"); System.out.println( "Looking up prediction for Groundhog #3:"); Groundhog gh = new Groundhog(3); if(hm.containsKey(gh)) System.out.println((Prediction)hm.get(gh)); else System.out.println("Key not found: " + gh); } } ///:~

//: c09:SpringDetector.java
// Semble plausible, mais ne fonctionne pas.
import java.util.*;

class Groundhog {
  int ghNumber;
  Groundhog(int n) { ghNumber = n; }
}

class Prediction {
  boolean shadow = Math.random() > 0.5;
  public String toString() {
    if(shadow)
      return "Six more weeks of Winter!";
    else
      return "Early Spring!";
  }
}

public class SpringDetector {
  public static void main(String[] args) {
    HashMap hm = new HashMap();
    for(int i = 0; i < 10; i++)
      hm.put(new Groundhog(i), new Prediction());
    System.out.println("hm = " + hm + "\n");
    System.out.println(
      "Looking up prediction for Groundhog #3:");
    Groundhog gh = new Groundhog(3);
    if(hm.containsKey(gh))
      System.out.println((Prediction)hm.get(gh));
    else
      System.out.println("Key not found: " + gh);
  }
} ///:~

Each Groundhog is given an identity number, so you can look up a Prediction in the HashMap by saying, “Give me the Prediction associated with Groundhog number 3.” The Prediction class contains a boolean that is initialized using Math.random( ), and a toString( ) that interprets the result for you. In main( ), a HashMap is filled with Groundhogs and their associated Predictions. The HashMap is printed so you can see that it has been filled. Then a Groundhog with an identity number of 3 is used as a key to look up the prediction for Groundhog #3 (which you can see must be in the Map).

Chaque Groundhog se voit attribuer un numéro d'identité, afin de pouvoir récupérer une Prediction dans le HashMap en disant « Donne-moi la Prediction associée au Groundhog numéro 3 ». La classe Prediction contient un boolean initialisé en utilisant Math.random(), et une méthode toString() pour en interpréter la valeur. Dans main(), un HashMap est rempli avec des Groundhogs et leurs Predictions associées. Le HashMap est affiché afin de voir qu'il a été correctement rempli. Un Groundhog avec une identité de 3 est alors utilisé comme clef pour extraire la prédiction du Groundhog numéro 3 (qui doit être dans le HashMap).

It seems simple enough, but it doesn’t work. The problem is that Groundhog is inherited from the common root class Object (which is what happens if you don’t specify a base class, thus all classes are ultimately inherited from Object). It is Object’s hashCode( ) method that is used to generate the hash code for each object, and by default it just uses the address of its object. Thus, the first instance of Groundhog(3) does not produce a hash code equal to the hash code for the second instance of Groundhog(3) that we tried to use as a lookup.

Tout ceci semble très simple, mais ne marche pas. Le problème vient du fait que Groundhog hérite de la classe de base Object (ce qui est le comportement par défaut si aucune superclasse n'est précisée ; toutes les classes dérivent donc en fin de compte de la classe Object). C'est donc la méthode hashCode() de Object qui est utilisée pour générer le code de hachage pour chaque objet, et par défaut, celle-ci renvoie juste l'adresse de cet objet. La première instance de Groundhog(3) ne renvoie donc pas le même code de hachage que celui de la seconde instance de Groundhog(3) que nous avons tenté d'utiliser comme clef d'extraction.

You might think that all you need to do is write an appropriate override for hashCode( ). But it still won’t work until you’ve done one more thing: override the equals( ) that is also part of Object. This method is used by the HashMap when trying to determine if your key is equal to any of the keys in the table. Again, the default Object.equals( ) simply compares object addresses, so one Groundhog(3) is not equal to another Groundhog(3).

On pourrait penser qu'il suffit de redéfinir hashCode(). Mais ceci ne fonctionnera toujours pas tant qu'on n'aura pas aussi redéfini la méthode equals() qui fait aussi partie de la classe Object. Cette méthode est utilisée par le HashMap lorsqu'il essaie de déterminer si la clef est égale à l'une des autres clefs de la table. Et la méthode par défaut Object.equals() compare simplement les adresses des objets, ce qui fait qu'un objet Groundhog(3) est différent d'un autre Groundhog(3).

Thus, to use your own classes as keys in a HashMap, you must override both hashCode( ) and equals( ), as shown in the following solution to the problem above:

Pour utiliser un nouveau type comme clef dans un HashMap, il faut donc redéfinir les deux méthodes hashCode() et equals(), comme le montre la solution suivante :

//: c09:SpringDetector2.java // A class that's used as a key in a HashMap // must override hashCode() and equals(). import java.util.*; class Groundhog2 { int ghNumber; Groundhog2(int n) { ghNumber = n; } public int hashCode() { return ghNumber; } public boolean equals(Object o) { return (o instanceof Groundhog2) && (ghNumber == ((Groundhog2)o).ghNumber); } } public class SpringDetector2 { public static void main(String[] args) { HashMap hm = new HashMap(); for(int i = 0; i < 10; i++) hm.put(new Groundhog2(i),new Prediction()); System.out.println("hm = " + hm + "\n"); System.out.println( "Looking up prediction for groundhog #3:"); Groundhog2 gh = new Groundhog2(3); if(hm.containsKey(gh)) System.out.println((Prediction)hm.get(gh)); } } ///:~

//: c09:SpringDetector2.java
// Une classe utilisée comme clef dans un HashMap
// doit redéfinir hashCode() et equals().
import java.util.*;

class Groundhog2 {
  int ghNumber;
  Groundhog2(int n) { ghNumber = n; }
  public int hashCode() { return ghNumber; }
  public boolean equals(Object o) {
    return (o instanceof Groundhog2)
      && (ghNumber == ((Groundhog2)o).ghNumber);
  }
}

public class SpringDetector2 {
  public static void main(String[] args) {
    HashMap hm = new HashMap();
    for(int i = 0; i < 10; i++)
      hm.put(new Groundhog2(i),new Prediction());
    System.out.println("hm = " + hm + "\n");
    System.out.println(
      "Looking up prediction for groundhog #3:");
    Groundhog2 gh = new Groundhog2(3);
    if(hm.containsKey(gh))
      System.out.println((Prediction)hm.get(gh));
  }
} ///:~

Note that this uses the Prediction class from the previous example, so SpringDetector.java must be compiled first or you’ll get a compile-time error when you try to compile SpringDetector2.java.

Notez que cet exemple utilise la classe Prediction de l'exemple précédent, donc SpringDetector.java doit déjà avoir été compilé ou vous aurez une erreur lorsque vous tenterez de compiler SpringDetector2.java.

Groundhog2.hashCode( ) returns the groundhog number as an identifier. In this example, the programmer is responsible for ensuring that no two groundhogs exist with the same ID number. The hashCode( ) is not required to return a unique identifier (something you’ll understand better later in this chapter), but the equals( ) method must be able to strictly determine whether two objects are equivalent.

Groundhog2.hashCode() renvoie le numéro de marmotte comme identifiant. Dans cet exemple, le programmeur doit s'assurer que deux marmottes ne portent pas le même identifiant. La méthode hashCode() n'est pas obligée de renvoyer un identifiant unique (vous comprendrez mieux ceci plus tard dans ce chapitre), mais la méthode equals() doit être capable de déterminer si deux objets sont strictement équivalents.

Even though it appears that the equals( ) method is only checking to see whether the argument is an instance of Groundhog2 (using the instanceof keyword, which is fully explained in Chapter 12), the instanceof actually quietly does a second sanity check, to see if the object is null, since instanceof produces false if the left-hand argument is null. Assuming it’s the correct type and not null, the comparison is based on the actual ghNumbers. This time, when you run the program, you’ll see it produces the correct output.

Bien que la méthode equals() semble ne vérifier que si l'argument est bien une instance de Groundhog2 (en utilisant le mot-clef instanceof, expliqué plus en détails dans le Chapitre 12), instanceof effectue en fait implicitement un deuxième contrôle puisqu'il renvoie false si l'argument de gauche est null. En assumant que le contrôle de type s'est bien passé, la comparaison est basée sur les ghNumbers des instances. Et cette fois, lorsqu'on lance le programme, on peut voir qu'il produit le résultat attendu.

When creating your own class to use in a HashSet, you must pay attention to the same issues as when it is used as a key in a HashMap.

On rencontre les mêmes problèmes quand on crée un nouveau type destiné à être stocké dans un HashSet ou utilisé comme clef dans un HashMap.

Understanding hashCode( )

Comprendre hashCode()

The above example is only a start toward solving the problem correctly. It shows that if you do not override hashCode( ) and equals( ) for your key, the hashed data structure (HashSet or HashMap) will not be able to deal with your key properly. However, to get a good solution for the problem you need to understand what’s going on inside the hashed data structure.

L'exemple précédent n'est que le début de la solution complète et correcte de ce problème. Il montre que la structure de données hachée (HashSet ou HashMap) ne sera pas capable de gérer correctement les objets clef si on ne redéfinit pas les méthodes hashCode() et equals() pour ces objets. Cependant, pour fournir une solution propre au problème, il faut comprendre ce qui se passe derrière la structure de données hachée.

First, consider the motivation behind hashing: you want to look up an object using another object. But you can accomplish this with a TreeSet or TreeMap, too. It’s also possible to implement your own Map. To do so, the Map.entrySet( ) method must be supplied to produce a set of Map.Entry objects. MPair will be defined as the new type of Map.Entry. In order for it to be placed in a TreeSet it must implement equals( ) and be Comparable:

Pour cela, il faut tout d'abord comprendre le pourquoi du hachage : on veut extraire un objet associé à un autre objet. Mais il est aussi possible d'accomplir ceci avec un TreeSet ou un TreeMap. Il est même possible d'implémenter sa propre Map. Pour cela, il nous faut fournir une méthode Map.entrySet() qui renvoie un ensemble d'objets Map.Entry. MPair sera définie comme le nouveau type de Map.Entry. Afin qu'il puisse être placé dans un TreeSet il doit implémenter equals() et être Comparable :

//: c09:MPair.java // A Map implemented with ArrayLists. import java.util.*; public class MPair implements Map.Entry, Comparable { Object key, value; MPair(Object k, Object v) { key = k; value = v; } public Object getKey() { return key; } public Object getValue() { return value; } public Object setValue(Object v){ Object result = value; value = v; return result; } public boolean equals(Object o) { return key.equals(((MPair)o).key); } public int compareTo(Object rv) { return ((Comparable)key).compareTo( ((MPair)rv).key); } } ///:~

//: c09:MPair.java
// Une Map implémentée avec des ArrayLists.
import java.util.*;

public class MPair
implements Map.Entry, Comparable {
  Object key, value;
  MPair(Object k, Object v) {
    key = k;
    value = v;
  }
  public Object getKey() { return key; }
  public Object getValue() { return value; }
  public Object setValue(Object v){
    Object result = value;
    value = v;
    return result;
  }
  public boolean equals(Object o) {
    return key.equals(((MPair)o).key);
  }
  public int compareTo(Object rv) {
    return ((Comparable)key).compareTo(
      ((MPair)rv).key);
  }
} ///:~

Notice that the comparisons are only interested in the keys, so duplicate values are perfectly acceptable.

Notez que les comparaisons ne s'effectuent que sur les clefs, les valeurs dupliquées sont donc parfaitement légales.

The following example implements a Map using a pair of ArrayLists:

L'exemple suivant impémente une Map en utilisant une paire d'ArrayLists :

//: c09:SlowMap.java // A Map implemented with ArrayLists. import java.util.*; import com.bruceeckel.util.*; public class SlowMap extends AbstractMap { private ArrayList keys = new ArrayList(), values = new ArrayList(); public Object put(Object key, Object value) { Object result = get(key); if(!keys.contains(key)) { keys.add(key); values.add(value); } else values.set(keys.indexOf(key), value); return result; } public Object get(Object key) { if(!keys.contains(key)) return null; return values.get(keys.indexOf(key)); } public Set entrySet() { Set entries = new HashSet(); Iterator ki = keys.iterator(), vi = values.iterator(); while(ki.hasNext()) entries.add(new MPair(ki.next(), vi.next())); return entries; } public static void main(String[] args) { SlowMap m = new SlowMap(); Collections2.fill(m, Collections2.geography, 25); System.out.println(m); } } ///:~

//: c09:SlowMap.java
// Une Map implémentée avec des ArrayLists.
import java.util.*;
import com.bruceeckel.util.*;

public class SlowMap extends AbstractMap {
  private ArrayList
    keys = new ArrayList(),
    values = new ArrayList();
  public Object put(Object key, Object value) {
    Object result = get(key);
    if(!keys.contains(key)) {
      keys.add(key);
      values.add(value);
    } else
      values.set(keys.indexOf(key), value);
    return result;
  }
  public Object get(Object key) {
    if(!keys.contains(key))
      return null;
    return values.get(keys.indexOf(key));
  }
  public Set entrySet() {
    Set entries = new HashSet();
    Iterator
      ki = keys.iterator(),
      vi = values.iterator();
    while(ki.hasNext())
      entries.add(new MPair(ki.next(), vi.next()));
    return entries;
  }
  public static void main(String[] args) {
    SlowMap m = new SlowMap();
    Collections2.fill(m,
      Collections2.geography, 25);
    System.out.println(m);
  }
} ///:~

The put( ) method simply places the keys and values in corresponding ArrayLists. In main( ), a SlowMap is loaded and then printed to show that it works.

La méthode put() stocke simplement les clefs et les valeurs dans les ArrayLists correspondantes. Dans main(), une SlowMap est remplie et imprimée pour montrer qu'elle fonctionne.

This shows that it’s not that hard to produce a new type of Map. But as the name suggests, a SlowMap isn’t very fast, so you probably wouldn’t use it if you had an alternative available. The problem is in the lookup of the key: there is no order so a simple linear search is used, which is the slowest way to look something up.

Ceci montre qu'il n'est pas difficile de produire un nouveau type de Map. Mais comme son nom le suggère, une SlowMap n'est pas très rapide, et on ne l'utilisera probablement pas si on dispose d'une autre alternative. Le problème se trouve dans la recherche de la clef : comme elles sont stockées sans aucun ordre, une recherche linéaire est effectuée, ce qui constitue la manière la plus lente de rechercher un item particulier.

The whole point of hashing is speed: hashing allows the lookup to happen quickly. Since the bottleneck is in the speed of the key lookup, one of the solutions to the problem could be by keeping the keys sorted and then using Collections.binarySearch( ) to perform the lookup (an exercise at the end of this chapter will walk you through this process).

Tout l'intérêt du hachage réside dans la vitesse : le hachage permet d'effectuer la recherche rapidement. Puisque le goulot d'étranglement est la recherche de la clef, une des solution du problème serait de garder les clefs triées et d'utiliser ensuite Collections.binarySearch() pour réaliser la recherche (un exercice à la fin du chapitre vous mènera le long de ce processus).

Hashing goes further by saying that all you want to do is to store the key somewhere so that it can be quickly found. As you’ve seen in this chapter, the fastest structure in which to store a group of elements is an array, so that will be used for representing the key information (note carefully that I said “key information,” and not the key itself). Also seen in this chapter was the fact that an array, once allocated, cannot be resized, so we have a problem: we want to be able to store any number of values in the Map, but if the number of keys is fixed by the array size, how can this be?

Le hachage va encore plus loin en spécifiant que tout ce qu'on a besoin de faire est de stocker la clef quelque part afin de pouvoir la retrouver rapidement. Comme on l'a déjà vu dans ce chapitre, la structure la plus efficace pour stocker un ensemble d'éléments est un tableau, c'est donc ce que nous utiliserons pour stocker les informations des clefs (notez bien que j'ai dit : « information des clefs » et non les clefs elles-mêmes). Nous avons aussi vu dans ce chapitre qu'un tableau, une fois alloué, ne peut être redimensionné, nous nous heurtons donc à un autre problème : nous voulons être capable de stocker un nombre quelconque de valeurs dans la Map, mais comment cela est-ce possible si le nombre de clefs est fixé par la taille du tableau ?

The answer is that the array will not hold the keys. From the key object, a number will be derived that will index into the array. This number is the hash code, produced by the hashCode( ) method (in computer science parlance, this is the hash function) defined in Object and presumably overridden by your class. To solve the problem of the fixed-size array, more than one key may produce the same index. That is, there may be collisions. Because of this, it doesn’t matter how big the array is because each key object will land somewhere in that array.

Tout simplement, le tableau n'est pas destiné à stocker les clefs. Un nombre dérivé de l'objet clef servira d'index dans le tableau. Ce nombre est le code de hachage, renvoyé par la méthode hashCode() (dans le jargon informatique, on parle de fonction de hachage) définie dans la classe Object et éventuellement redéfinie dans un nouveau type. Pour résoudre le problème du tableau de taille fixe, plus d'une clef peut produire le même index ; autrement dit, les collisions sont autorisées. Et de ce fait, la taille du tableau importe peu puisque chaque objet clef atterira quelque part dans ce tableau.

So the process of looking up a value starts by computing the hash code and using it to index into the array. If you could guarantee that there were no collisions (which could be possible if you have a fixed number of values) then you’d have a perfect hashing function, but that’s a special case. In all other cases, collisions are handled by external chaining: the array points not directly to a value, but instead to a list of values. These values are searched in a linear fashion using the equals( ) method. Of course, this aspect of the search is much slower, but if the hash function is good there will only be a few values in each slot, at the most. So instead of searching through the entire list, you quickly jump to a slot where you have to compare a few entries to find the value. This is much faster, which is why the HashMap is so quick.

Le processus de recherche d'une valeur débute donc par le calcul du code de hachage, qu'on utilise pour indexer le tableau. Si on peut garantir qu'il n'y a pas eu de collisions (ce qui est possible si on a un nombre fixé de valeurs), alors on dispose d'une fonction de hachage parfaite, mais il s'agit d'un cas spécial. Dans les autres cas, les collisions sont gérées par un chaînage externe : le tableau ne pointe pas directement sur une valeur, mais sur une liste de valeurs. Ces valeurs sont alors parcourues de façon linéaire en utilisant la méthode equals(). Bien sûr, cet aspect de la recherche est plus lent, mais si la fonction de hachage est correctement écrite, il n'y aura que quelques valeurs au plus dans chaque emplacement. Et donc au lieu de parcourir toute la liste pour trouver une valeur, on saute directement dans une cellule où seules quelques entrées devront être comparées pour trouver la valeur. Cette aproche est bien plus efficace, ce qui explique pourquoi un HashMap est si rapide.

Knowing the basics of hashing, it’s possible to implement a simple hashed Map:

Connaissant les bases du hachage, il est possible d'implémenter une Map simple hachée :

//: c09:SimpleHashMap.java // A demonstration hashed Map. import java.util.*; import com.bruceeckel.util.*; public class SimpleHashMap extends AbstractMap { // Choose a prime number for the hash table // size, to achieve a uniform distribution: private final static int SZ = 997; private LinkedList[] bucket= new LinkedList[SZ]; public Object put(Object key, Object value) { Object result = null; int index = key.hashCode() % SZ; if(index < 0) index = -index; if(bucket[index] == null) bucket[index] = new LinkedList(); LinkedList pairs = bucket[index]; MPair pair = new MPair(key, value); ListIterator it = pairs.listIterator(); boolean found = false; while(it.hasNext()) { Object iPair = it.next(); if(iPair.equals(pair)) { result = ((MPair)iPair).getValue(); it.set(pair); // Replace old with new found = true; break; } } if(!found) bucket[index].add(pair); return result; } public Object get(Object key) { int index = key.hashCode() % SZ; if(index < 0) index = -index; if(bucket[index] == null) return null; LinkedList pairs = bucket[index]; MPair match = new MPair(key, null); ListIterator it = pairs.listIterator(); while(it.hasNext()) { Object iPair = it.next(); if(iPair.equals(match)) return ((MPair)iPair).getValue(); } return null; } public Set entrySet() { Set entries = new HashSet(); for(int i = 0; i < bucket.length; i++) { if(bucket[i] == null) continue; Iterator it = bucket[i].iterator(); while(it.hasNext()) entries.add(it.next()); } return entries; } public static void main(String[] args) { SimpleHashMap m = new SimpleHashMap(); Collections2.fill(m, Collections2.geography, 25); System.out.println(m); } } ///:~

//: c09:SimpleHashMap.java
// Démonstration d'une Map hachée.
import java.util.*;
import com.bruceeckel.util.*;

public class SimpleHashMap extends AbstractMap {
  // Choisir un nombre premier pour la taille de la table
  // de hachage, afin d'obtenir une distribution uniforme :
  private final static int SZ = 997;
  private LinkedList[] bucket= new LinkedList[SZ];
  public Object put(Object key, Object value) {
    Object result = null;
    int index = key.hashCode() % SZ;
    if(index < 0) index = -index;
    if(bucket[index] == null)
      bucket[index] = new LinkedList();
    LinkedList pairs = bucket[index];
    MPair pair = new MPair(key, value);
    ListIterator it = pairs.listIterator();
    boolean found = false;
    while(it.hasNext()) {
      Object iPair = it.next();
      if(iPair.equals(pair)) {
        result = ((MPair)iPair).getValue();
        it.set(pair); // Remplace l'ancien par le nouveau
        found = true;
        break;
      }
    }
    if(!found)
      bucket[index].add(pair);
    return result;
  }
  public Object get(Object key) {
    int index = key.hashCode() % SZ;
    if(index < 0) index = -index;
    if(bucket[index] == null) return null;
    LinkedList pairs = bucket[index];
    MPair match = new MPair(key, null);
    ListIterator it = pairs.listIterator();
    while(it.hasNext()) {
      Object iPair = it.next();
      if(iPair.equals(match))
        return ((MPair)iPair).getValue();
    }
    return null;
  }
  public Set entrySet() {
    Set entries = new HashSet();
    for(int i = 0; i < bucket.length; i++) {
      if(bucket[i] == null) continue;
      Iterator it = bucket[i].iterator();
      while(it.hasNext())
        entries.add(it.next());
    }
    return entries;
  }
  public static void main(String[] args) {
    SimpleHashMap m = new SimpleHashMap();
    Collections2.fill(m,
      Collections2.geography, 25);
    System.out.println(m);
  }
} ///:~

Because the “slots” in a hash table are often referred to as buckets, the array that represents the actual table is called bucket. To promote even distribution, the number of buckets is typically a prime number. Notice that it is an array of LinkedList, which automatically provides for collisions—each new item is simply added to the end of the list.

Comme on appelle souvent seaux les « emplacements » d'une table de hachage, le tableau représentant la table est appelé bucket. Pour s'assurer d'une distribution la plus régulière possible, le nombre de seaux est typiquement un nombre premier. Notez qu'il s'agit d'un tableau de LinkedLists, qui permet de gérer automatiquement les collisions - chaque nouvel item est simplement ajouté à la fin de la liste.

The return value of put( ) is null or, if the key was already in the list, the old value associated with that key. The return value is result, which is initialized to null, but if a key is discovered in the list then result is assigned to that key.

La valeur de retour de put() est null ou l'ancienne valeur associée à la clef si la celle-ci était présente dans la liste. La valeur de retour est result, qui est initialisée à null, mais se voit assigner une clef si celle-ci est découverte dans la liste.

For both put( ) and get( ), the first thing that happens is that the hashCode( ) is called for the key, and the result is forced to a positive number. Then it is forced to fit into the bucket array using the modulus operator and the size of the array. If that location is null, it means there are no elements that hash to that location, so a new LinkedList is created to hold the object that just did. However, the normal process is to look through the list to see if there are duplicates, and if there are, the old value is put into result and the new value replaces the old. The found flag keeps track of whether an old key-value pair was found and, if not, the new pair is appended to the end of the list.

Les deux méthodes put() et get() commencent par appeler la méthode hashCode() de l'objet clef, dont le résultat est forcé à un nombre positif. Il est alors forcé dans la plage du tableau via l'opérateur modulo et la taille du tableau. Si l'emplacement est null, cela veut dire qu'aucun élément ne hache à cette localisation, et donc une nouvelle LinkedList est créée pour contenir l'objet qui vient de le faire. Sinon, le processus normal est de parcourir la liste pour voir s'il existe un doublon, et si c'est le cas, l'ancienne valeur est stockée dans result et la nouvelle valeur remplace l'ancienne. Le flag found permet de savoir si une ancienne paire clef - valeur a été trouvée, et dans le cas contraire, une nouvelle paire est ajoutée à la fin de la liste.

In get( ), you’ll see very similar code as that contained in put( ), but simpler. The index is calculated into the bucket array, and if a LinkedList exists it is searched for a match.

Le code de get() est similaire à celui de put(), en plus simple. L'index dans le tableau bucket est calculé, et si une LinkedList existe elle est parcourue pour trouver une concordance.

entrySet( ) must find and traverse all the lists, adding them to the result Set. Once this method has been created, the Map can be tested by filling it with values and then printing them.

entrySet() doit trouver et parcourir toutes les listes, ajoutant tous les éléments dans le Set résultat. Une fois cette méthode fournie, la Map peut être testée en la remplissant avec des valeurs et en les imprimant.

\\\

Haut de page

///