I want to know if you have seen a particular problem and used a particular data structure, and why? I usually end up using just arrays, lists, or hash maps for most of day to day work. If there is a particular algorithm needs to be implemented then I try to think of trees or linked list. Just wondering how do you decide which one is better? What are the tradeoffs?
–Posted at Google Groups
Data structures should be matched to the problem to be accomplished. What kinds of problems are we trying to solve? Many, perhaps most, problems can be solved with just a few basic data structures. Arrays and hashes are some of the most important of these.
As a general rule, the data structure will affect the runtime of your algorithm, but not the result.
So you can do everything with arrays. The problem is that, if the algorithm involves appending a lot of data to an array, you might have to grow it each time. This can involve reallocations and copies, alternatively making every array huge. On a small problem and a fast machine this might not matter, but otherwise programmers would probably look to linked lists.
The universal properties of data structures are the amount of memory used in storing the contents, and the time and additional memory each operation takes. You come to know those for some important kinds of data structures and look for a fit with the requirements on footprint or responsiveness.
One resulting tradeoff is the usual time vs. space — e.g., hash tables provide fast search but need many more slots than entries. Also, a given storage method may permit several algorithms for some operation that vary along this dimension.
The correct answer is of course that you should pick the data structure depending on the problem. In general, (in complex applications), you can always change the data structures dynamically to prepare the data for a given algorithm.
For any problem that I can think of, the data structures recommend themselves, and in a very obvious manner. It’s important to know what the different data structures do. You don’t even have to know how they do it (but you do have to understand O(f(n)) notation or you won’t know which one is better).
Cost-benefit analysis: What features does the data structure provide, what are costs (time, space, etc.) to using them, and how well does this combination suit the particular algorithm I have in mind?
I think that anyone can memorize a list of (for example) 7 things pretty well. If you can memorize a list of 7 (or whatever) fundamental data structures and what they are used for then you will be able to choose them effectively when they are called for.
Sometimes, your choices won’t be optimal (e.g. you used a heap when you could have used a weak-heap). But the difference will probably be small enough that it won’t matter much anyway. However I program mainly in C. In Perl hashes are built in, so of course it is easy to use them even where, strictly, you don’t need constant access time.
Intertwined but in a slightly different vein, another tradeoff is among the operations on a given ADT (e.g. Union-Find) — different underlying storage methods might make one operation or another fast but not both at once! If you know something about the corresponding object’s usage pattern, look into that data structure which makes the pattern fast. But of course, this choice generally affects total storage used as well.
If (on the other hand) you use an unordered array when you should have used a hash table and there are millions of objects in it, the project will fail (until/unless someone repairs it).
Undergrad computer science courses do talk about simple data structures, but they don’t really do a good job of explaining that those data structures are what you should use most of the time, and they tend to spend a good deal of time discussing more advanced data structures (e.g. red-black trees, B-trees) that usually aren’t what you want.
Find a copy of “C++ Data Structures” / “Java Data Structures “etc book and try reading. Couldn’t hurt. Also, see “Effective STL” for recommendations on how to choose standard containers.
Algorithms are usually written around some kind of collection, and different collection can fit the bill. The performance requirements are usually the deciding factor when more than one data structure is capable of just doing the grunt work. See the “Effective STL” and other books in its bibliography.
Which Data Structure to Use? Type of Data Structure Selection in Programming !
Computer Consulting Services: Protecting Client Data
As a provider of computer consulting services, you should communicate to your clients that a dedicated server generally can enforce strong user passwords. This means your clients can:
[…] hash maps for most of day to … It??s important to know what the different data structures do. …http://www.sharpprogrammer.com/data-structures/which-data-structure-to-use-data-structure-selection-…Course designatorCIS 330 programming Language II 3 Credit Hours … information, creating and using […]
RSS feed for comments on this post · TrackBack URI
You must be logged in to post a comment.
tzury bar yochay said,
September 8, 2007 @ 11:26 pmregarding your comment:
I just entered into Python world. I speak Perl. Once I grab up enough syntax, I’ll look at the framework. Thanks
I think you should take a look at web.py — http://webpy.org You are most likely to fall in love with it.
Tzury Bar Yochay tzury dot by at gmail dot com