Professor Zhou Ligong has recently completed and published his long-awaited book, "Programming and Data Structure," which has been made available for free download to electronic engineers and college students. With Professor Zhou's authorization, the content of this book is now being serialized for wider access. > > > 1.1 Hash Table > > > 1.1.1 Question Imagine you need to design an information management system to handle data for 10,000 students. Each student record includes a student ID, name, gender, height, and weight. The structure of the student record is defined as follows: To store these records, one straightforward approach is to use a contiguous array, such as In such cases, a simple linear search might be used, where each record is checked sequentially until a match is found. While this method works, it becomes very slow as the number of records increases. For instance, searching through 10,000 records could take up to 10,000 comparisons in the worst case. A more efficient solution would involve using a hash table. A hash table allows for fast lookups by converting the student ID into an index using a hash function. This index determines where the record should be stored or retrieved from. However, designing an effective hash function is crucial to ensure even distribution of records across the table. One common challenge with hash tables is handling collisions—when two different keys map to the same index. To manage this, techniques like chaining or open addressing can be used. In the case of chaining, each index in the hash table points to a linked list of records, allowing multiple entries to coexist at the same index. Another alternative is to divide the records into multiple groups and store them in separate arrays. This reduces the number of comparisons needed during a search but requires careful grouping rules to ensure even distribution. As the number of groups increases, the efficiency of both insertion and lookup improves. Ultimately, the choice between different data structures depends on the specific requirements of the system. Hash tables offer fast lookups but require careful implementation, while arrays and linked lists provide simpler but slower alternatives. Understanding the trade-offs between these approaches is essential when designing efficient data management systems. > > > 1.1.2 Types of Hash Tables The hash table type This structure contains essential information about the hash table, including a pointer to the array of linked lists, the size of the array, and the length of the stored records. It also includes a hash function that maps keys (such as student IDs) to indices in the table. For general use, the hash table structure must allow for flexible storage of any data type. Therefore, the structure includes fields for the size of the array, the length of each record, the length of the key, and a pointer to the user-defined hash function. Here is the complete definition of the hash table structure: Before using the hash table functions, you must first define an instance of this structure. If your system requires multiple hash tables, you can create several instances of the
Ungrouped,High Quality Ungrouped,Ungrouped Details, CN Shenzhen Waweis Technology Co., Ltd. , https://www.waweis.comtypedef struct _student {
unsigned char id[6]; // student number (6 bytes)
char name[10]; // name
char sex; // gender
float height, weight; // height, weight
} student_t;
student_t student_db[10000];
. However, finding a specific student by their ID can be inefficient if the IDs are not sequential. For example, student IDs may follow a format like "year + major code + class + serial number," making direct indexing impractical.struct _hash_db
is defined as follows:typedef struct _hash_db hash_db_t;
typedef unsigned int (*hash_func_t)(const void *key);
struct _hash_db {
slist_head_t *p_head; // pointer to the first element of the array
unsigned int size; // number of elements in the array
unsigned int value_len; // length of a record
unsigned int key_len; // length of the key
hash_func_t pfn_hash; // pointer to the hash function
};
hash_db_t
type.