Unoffical empeg BBS

Quick Links: Empeg FAQ | RioCar.Org | Hijack | BigDisk Builder | jEmplode | emphatic
Repairs: Repairs

Topic Options
#289438 - 07/11/2006 21:40 Am I daft? (Or: Another Pointy-headed SQL/Access Question)
Ezekiel
pooh-bah

Registered: 25/08/2000
Posts: 2413
Loc: NH USA
Ok, so don't answer that first one.

I've been beating my head on what's probably an old chestnut of a database question and I cannot find or figure out the answer. So, I come begging help from the mighty crania of the empegBBS, fount of all knowledge!

The task seems simple enough (when I started it three days ago). I have a source data table - with many records. Each of these records is divided into 'description' and 'data' fields. There are multiple description fields and multiple data fields, some of each type contain NULL values.

Some (many) of the data fields, when considered alone without the description fields, are duplicate groups. There are no duplicate groups of description fields when they are considered alone (there's no two sets of data for any one unique description is another way to put it).

What I'm attempting to do is to create an data index table which lists description values next to a data group index number. That way I can see which groups of descriptions share common data values.

I started off using Joins, but then learned that NULL entries cannot be used for comparisons in a join, so no love there.

Indexes seem to do what I want when considered 'Unique' but not 'Primary'. But how do I substantiate and index value into a concrete field value? So, I'm stuck there.

I suppose I could do this in multiple steps:
Step 0: Use a 'Select Distinct' query to get my data groups into a table with an autonumbering field to get my index numbers.
Step 1: Use an update querty to change all NULL entries to a dummy string (like 'XYXYXYXY')
Step 2: Inner join to get index values in a table via a query.
Step 3: Set all step 1 dummy text back to NULL values using a second update query.

This seems like a big hack and doesn't seem very elegant (although it should work just fine).

I've attached a trivial sample (zipped) Access file. This isn't my real data, but it should be helpful as an example.

Tables:
Raw_Data - this is my source table
Distinct_Data_Table - this is a table I created by running the Distinct_Data_Query.
Desired_Output_Table - this is a hand-constructed version of what I'm aiming for.

The real data set runs about 56,000 records and has 14 data fields and three description fields.

Any help is very much appreciated.

-Zeke


Attachments
290219-TestDataIndexing.zip (34 downloads)

_________________________
WWFSMD?

Top
#289439 - 07/11/2006 22:06 Re: Am I daft? (Or: Another Pointy-headed SQL/Access Question) [Re: Ezekiel]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
I don't have a copy of Access, so I can't see your example, so forgive me if these questions seem stupid.

It sounds like you've got a table that looks like this:

1:A
1:B
1:C
2:D
2:E
3:F
3:G
3:H

and you want to find all the letters that match a particular number, right?

So you just want to know how to construct a query that will give you back a result that shows all of the letters for each number. Basically, in pseudo code:
Code:
array = select unique number from table
for each array_element
select * from table where number = array_element



Or are you wanting a new table that will combine all of the letters into a single field so that the numbers become unique?
_________________________
Bitt Faulk

Top
#289440 - 08/11/2006 02:20 Re: Am I daft? (Or: Another Pointy-headed SQL/Access Question) [Re: wfaulk]
gbeer
carpal tunnel

Registered: 17/12/2000
Posts: 2665
Loc: Manteca, California
Code:
 Original data set
Descr2 Descr1 Data1 Data2 Data3
Alpha 1 2 X
Bravo 3 4 Y
Charlie 1 X
Delta 1 2 X
Sigma 5 9
1 Sigma 5 9 Z
=======================================
Then sorted by data1,data2,data3 and adding an index that increments with changes in the data.

Descr2 Descr1 Data1 Data2 Data3 index
Alpha 1 2 X 1
Delta 1 2 X 1
Charlie 1 X 2
Bravo 3 4 Y 3
1 Sigma 5 9 Z 4
Sigma 5 9 5
========================================
Finally list the description pairs that exhibit common data sets (index values)
1 - |Alpha, |Delta
2 - |Charlie
3 - |Bravo
4 - 1|Sigma
5 - |Sigma




That's how I understand the problem.
_________________________
Glenn

Top
#289441 - 08/11/2006 08:11 Re: Am I daft? (Or: Another Pointy-headed SQL/Access Question) [Re: Ezekiel]
Mach
old hand

Registered: 15/07/2002
Posts: 828
Loc: Texas, USA
Could you concatenate all the data fields together (or substring of each) to make a unique key and then join or create a numeric index on that?

qryUnique:
SELECT [Raw_Data].[Descr2], [Raw_Data].[Descr1], [Raw_Data].[Data1], [Raw_Data].[Data2], [Raw_Data].[Data3], Trim([data1]) & Trim([data2]) & Trim([data3]) AS INDX
FROM Raw_Data;

qryOutput:
SELECT DISTINCT qryUnique.Descr1, qryUnique.Descr2, qryUnique.INDX
FROM qryUnique;

Top
#289442 - 08/11/2006 12:38 Re: Am I daft? (Or: Another Pointy-headed SQL/Access Question) [Re: Mach]
Ezekiel
pooh-bah

Registered: 25/08/2000
Posts: 2413
Loc: NH USA
Bitt,
The null entries in the data screw up the data=index_data since data(NULL)<>index_data(NULL). Your question makes perfect sense as you didn't have the data.

Mach,
Hadn't thought of concatenation. It should work great, but I don't have mathematical certainty that two concatenations wouldn't be equal (for example to entries with adjacent data pairs of (10,.234 = 10.234) and (1,0.234 = 10.234). Highly unlikely with so many data records - I know. However, this is the easiest to code, and I can check by looking at the total number or records of the resulting index table - they should equal the number of records in the source table.

gbeer,
Very cool. In fact I've already got a similar module written to take single year entries into ranges (I have discontinuous year ranges for a given description - so a straight up min/max group-by query didn't work). The code will be a bit longer because of the large quantity of data fields, but it should be mathematically bullet-proof for all data sets.

Thanks guys. I really appreciate the help.

-Zeke
_________________________
WWFSMD?

Top
#289443 - 23/05/2007 18:32 Re: Am I daft? (Or: Another Pointy-headed SQL/Access Question) [Re: Ezekiel]
Ezekiel
pooh-bah

Registered: 25/08/2000
Posts: 2413
Loc: NH USA
Many moons after the first post and I'm still deep in Access. I found the following article over at Tech Republic very useful. As the article title implies it is 10 tricks for handling null values in Microsoft Access.

I thought someone else might also find this useful.

-Zeke
_________________________
WWFSMD?

Top