Sunday, February 19, 2012

Doing Summation on multiple criterias on the same column in a single

Hi all
I need to perform a summation on a column of a table based on a
criteria given in another column in the same table. The catch is i
need to perform different sums according to the number of criterias in
the criteria column, in a single query. the structure of the table is
somethinmg like this (only relevant columns are shown)
TABLE1
Value - numeric(20,6)
Month - int
indicator - bit
Now i need to do something like:
SELECT Month, SUM(Value) FROM TABLE1
WHERE indicator = 1
GROUP BY Month
and also
SELECT Month, SUM(Value) FROM TABLE1
WHERE indicator = 0
GROUP BY Month
How can i do this in a single query, something like this:
SELECT Month, SUM(Value where indicator=1), SUM(Value where
indicator=0) and so on ......
Could any body please help me on this ?You need to stop using reserved words (MONTH, INDICATOR, etc) for
column names:
SELECT sales_month,
SUM(CASE WHEN flag = 0 THEN vague_value ELSE NULL END) AS
foo_tot,
SUM(CASE WHEN flag = 1 THEN vague_value ELSE NULL END) AS
bar_tot,
FROM Foobar;
It is also a bad idea to use proprietary BIT data types to fake
assembly language style programming. SQL is a predicate language;
that is, we discover a fact with a predicate rather than set a flag.|||Try:
SELECT
Month
, SUM(CASE WHEN indicator = 0 THEN Value ELSE 0 END) Value0
, SUM(CASE WHEN indicator = 1 THEN Value ELSE 0 END) Value1
FROM TABLE1
GROUP BY Month
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau
"Darsin" <darsin@.gmail.com> wrote in message
news:8fc9e52e-1606-44d9-a722-4ae0f29542b6@.s13g2000prd.googlegroups.com...
Hi all
I need to perform a summation on a column of a table based on a
criteria given in another column in the same table. The catch is i
need to perform different sums according to the number of criterias in
the criteria column, in a single query. the structure of the table is
somethinmg like this (only relevant columns are shown)
TABLE1
Value - numeric(20,6)
Month - int
indicator - bit
Now i need to do something like:
SELECT Month, SUM(Value) FROM TABLE1
WHERE indicator = 1
GROUP BY Month
and also
SELECT Month, SUM(Value) FROM TABLE1
WHERE indicator = 0
GROUP BY Month
How can i do this in a single query, something like this:
SELECT Month, SUM(Value where indicator=1), SUM(Value where
indicator=0) and so on ......
Could any body please help me on this ?|||Hi
SUM(CASE WHEN indicator =0 THEN value END ) as val1
SUM(CASE WHEN indicator =1 THEN value END ) as val1
FROM tbl
"Darsin" <darsin@.gmail.com> wrote in message
news:8fc9e52e-1606-44d9-a722-4ae0f29542b6@.s13g2000prd.googlegroups.com...
> Hi all
> I need to perform a summation on a column of a table based on a
> criteria given in another column in the same table. The catch is i
> need to perform different sums according to the number of criterias in
> the criteria column, in a single query. the structure of the table is
> somethinmg like this (only relevant columns are shown)
> TABLE1
> Value - numeric(20,6)
> Month - int
> indicator - bit
> Now i need to do something like:
> SELECT Month, SUM(Value) FROM TABLE1
> WHERE indicator = 1
> GROUP BY Month
> and also
> SELECT Month, SUM(Value) FROM TABLE1
> WHERE indicator = 0
> GROUP BY Month
>
> How can i do this in a single query, something like this:
> SELECT Month, SUM(Value where indicator=1), SUM(Value where
> indicator=0) and so on ......
> Could any body please help me on this ?|||One way:
SELECT Month,
SUM(CASE WHEN indicator = 1 THEN Value ELSE 0 END) AS sum1,
SUM(CASE WHEN indicator = 0 THEN Value ELSE 0 END) AS sum0
FROM TABLE1
GROUP BY Month
HTH,
Plamen Ratchev
http://www.SQLStudio.com|||> It is also a bad idea to use proprietary BIT data types to fake
> assembly language style programming. SQL is a predicate language;
> that is, we discover a fact with a predicate rather than set a flag.
What if the fact IS a flag, and is not based on other data in your database?
e.g. did the patient visit Congo anytime between 1978 and 1989? Instead of
a "flag" saying yes or no, would you rather store each patient's entire
travel itineraries for the period in question? What do you when that is not
available, but the patient does know that he/she was there during that time
period?
You can't answer 8 billion different design questions with your four
standards-based rules. Real world design and development dictates a little
more flexibility than that.|||>> What if the fact IS a flag, and is not based on other data in your database? <<
It might or might not be based on other data; that is the problem. I
don't want to write triggers to constantly update the flags.
>> e.g. did the patient visit Congo anytime between 1978 and 1989? Instead of a "flag" saying yes or no, would you rather store each patient's entire travel itineraries for the period in question? <<
The alternative is a set of flags that cover all of the countries on
Earth, and all of the possible time periods :) Oh, I also hope that
you created exactly the right flags, since they carry so little
data.
Eventually, I will probably have to have his travel itineraries to
follow a disease pattern. I have a friend who is an epidemiologist at
the CDC and that is how they do it.
>> What do you when that is not available, but the patient does know that he/she was there during that time period? <<
You put it the chart as free-form data and then you check his passport
records. Hey, this is the usual missing/fuzzy data problem. What did
you do when you never had a flag for that "bit" of data? What do you
do when the important year was 1990? More bit flags? Have you ever
considered what the triggers and constraints to keep the flags
current?|||> You put it the chart as free-form data and then you check his passport
> records. Hey, this is the usual missing/fuzzy data problem. What did
> you do when you never had a flag for that "bit" of data? What do you
> do when the important year was 1990? More bit flags? Have you ever
> considered what the triggers and constraints to keep the flags
> current?
The fact remains, some things are just that, flags... no further information
necessary, no updating required.|||On Mon, 24 Mar 2008 10:04:23 -0700 (PDT), --CELKO-- wrote:
>Eventually, I will probably have to have his travel itineraries to
>follow a disease pattern. I have a friend who is an epidemiologist at
>the CDC and that is how they do it.
Hi Joe,
But not every organization that collects information is interested in
tracking disease patterns.
I donate blood on a regular basis. And every time I go, I have to fill
out some forms. One of the things they want to know is whether I have
visited the United States during the last 90 days - because if I have, I
can't donate blood at that time and they'll ask me to make a new
appointment. I think it has to do with the possibility of infection with
the West Nile virus.
Anyway, they don't ask me for my travel itinerary or whatever, just one
thing: "have you been in the United Stated during the last 90 days"? and
two check boxes conveniently marked yes and no. How would you model
that?
--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis|||>> "have you been in the United Stated during the last 90 days"? and two check boxes conveniently marked yes and no. How would you model that? <<
They don't care about your trips to AIDS infected Africa? Avian flu
in China? Mad Cow Disease in the UK? I think I would go with three
tables -- Donors for your general demographics, Donations for a
history of donations and a Quarantine list with countries and dates
that would exclude a donation at this time.
I would first check the date of the last donation to be sure that you
are not giving too much and too often. In the US you get recognition
for donating over a gallon, etc. We want a history in the DB.
Then I would ask "what countries have you visited since << insert
date>>?" One of the options could include the Netherlands, altho the
front end might present it as "None" or something similar to the
donor. When the situation changes, I change the Quarantine list.|||On Mar 24, 6:32=A0pm, "Tom Moreau" <t...@.dont.spam.me.cips.ca> wrote:
> Try:
> SELECT
> =A0 =A0 Month
> , =A0SUM(CASE WHEN indicator =3D 0 THEN Value ELSE 0 END) Value0
> , =A0SUM(CASE WHEN indicator =3D 1 THEN Value ELSE 0 END) Value1
> FROM TABLE1
> GROUP BY Month
> --
> =A0 =A0Tom
> ----
> Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
> SQL Server MVP
> Toronto, ON =A0 Canadahttps://mvp.support.microsoft.com/profile/Tom.Moreau=
> "Darsin" <dar...@.gmail.com> wrote in message
> news:8fc9e52e-1606-44d9-a722-4ae0f29542b6@.s13g2000prd.googlegroups.com...
> Hi all
> I need to perform a summation on a column of a table based on a
> criteria given in another column in the same table. The catch is i
> need to perform different sums according to the number of criterias in
> the criteria column, in a single query. the structure of the table is
> somethinmg like this (only relevant columns are shown)
> TABLE1
> Value - numeric(20,6)
> Month - int
> indicator - bit
> Now i need to do something like:
> SELECT Month, SUM(Value) FROM TABLE1
> WHERE indicator =3D 1
> GROUP BY Month
> and also
> SELECT Month, SUM(Value) FROM TABLE1
> WHERE indicator =3D 0
> GROUP BY Month
> How can i do this in a single query, something like this:
> SELECT Month, SUM(Value where indicator=3D1), SUM(Value where
> indicator=3D0) and so on ......
> Could any body please help me on this ?
Hi all,
First of all my apologies for using Month, Value, etc keywords in the
example, will remember to not do it again, and thank you for a prompt
reply.
Secondly, initially i used a temp table to fill the first case and
then updated the temp table with the second case.
now the total number of records are around 50,000 and above, so my
question is will there be a performance (query excute time)
improvement or degradation. In case any furthur details are required
than please do let me know.
Thanks
in case if it there|||> They don't care about your trips to AIDS infected Africa? Avian flu
> in China? Mad Cow Disease in the UK?
No, Joe. Have you ever donated blood in the U.S.? I have donated just
under 3 gallons and I have never been asked about this stuff. There are a
few yes / no questions about certain countries and certain time periods.
There is also a yes / no question about whether I have ever exchanged sex
for money. On that there is not a request for a list of transactions and/or
acts/positions, either. Just a checkbox. I'm sure you would prefer all the
sordid details, but alas, donors are expected to take about 10 minutes to
fill out the questionnaire; it is not a take-home exam with essay questions
and bonus points.
> I would first check the date of the last donation to be sure that you
> are not giving too much and too often. In the US you get recognition
> for donating over a gallon, etc. We want a history in the DB.
But Joe, different states have different organizations that collect blood...
some multiple. So, I could have donated to some blood donor trailer in
Arkansas yesterday, and for some reason want to donate again today in
Boston. Since they can't check every single blood donor organization in the
world, they ask... "have you given blood in the past 90 days?" (They don't
bother asking "have you given blood to this organization in the past 90
days?" because that is something they can verify on their own.) What you
are expecting is like going to the equivalent of the DMV in Tehran or Beirut
or Cairo and expecting them to be able to print you up a new Rhode Island
driver's license.
> Then I would ask "what countries have you visited since << insert
> date>>?"
They do ask that... "what countries have you visited in the past 36 months?"
is the one in my state. But they also want to know about your presence in
certain countries in certain periods of time, e.g. Congo between 1978 and
1982, etc. Now do you really think they are going to leave an open-ended
essay question, require a certified copy of your travel itinerary, etc.? Of
course not. They are relying on you to be honest and check yes / no. They
are going to take your blood and test it anyway, but this kind of screening
is a very simple way to prevent that blood from being taken and having to be
tested.
Does the concept of only needing a yes / no answer really escape you this
much? You are a clever person, and I have to assume this is all for show.
A|||> Quarantine list with countries and dates
> that would exclude a donation at this time.
They are interested in the past 90 days (as Hugo already explained), not a
fixed date range. So, would you have a row for the U.S. for every 90-day
period possible?|||For performance, you can look at indexing. For example, an index on Month
with included columns Value and Indicator may help.
--
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau
"Darsin" <darsin@.gmail.com> wrote in message
news:73c7bb9f-4c19-4d4f-9e8d-12f29e86f2a2@.d4g2000prg.googlegroups.com...
On Mar 24, 6:32 pm, "Tom Moreau" <t...@.dont.spam.me.cips.ca> wrote:
> Try:
> SELECT
> Month
> , SUM(CASE WHEN indicator = 0 THEN Value ELSE 0 END) Value0
> , SUM(CASE WHEN indicator = 1 THEN Value ELSE 0 END) Value1
> FROM TABLE1
> GROUP BY Month
> --
> Tom
> ----
> Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
> SQL Server MVP
> Toronto, ON Canadahttps://mvp.support.microsoft.com/profile/Tom.Moreau
> "Darsin" <dar...@.gmail.com> wrote in message
> news:8fc9e52e-1606-44d9-a722-4ae0f29542b6@.s13g2000prd.googlegroups.com...
> Hi all
> I need to perform a summation on a column of a table based on a
> criteria given in another column in the same table. The catch is i
> need to perform different sums according to the number of criterias in
> the criteria column, in a single query. the structure of the table is
> somethinmg like this (only relevant columns are shown)
> TABLE1
> Value - numeric(20,6)
> Month - int
> indicator - bit
> Now i need to do something like:
> SELECT Month, SUM(Value) FROM TABLE1
> WHERE indicator = 1
> GROUP BY Month
> and also
> SELECT Month, SUM(Value) FROM TABLE1
> WHERE indicator = 0
> GROUP BY Month
> How can i do this in a single query, something like this:
> SELECT Month, SUM(Value where indicator=1), SUM(Value where
> indicator=0) and so on ......
> Could any body please help me on this ?
Hi all,
First of all my apologies for using Month, Value, etc keywords in the
example, will remember to not do it again, and thank you for a prompt
reply.
Secondly, initially i used a temp table to fill the first case and
then updated the temp table with the second case.
now the total number of records are around 50,000 and above, so my
question is will there be a performance (query excute time)
improvement or degradation. In case any furthur details are required
than please do let me know.
Thanks
in case if it there|||On Mon, 24 Mar 2008 20:25:15 -0700 (PDT), --CELKO-- wrote:
>> "have you been in the United Stated during the last 90 days"? and two check boxes conveniently marked yes and no. How would you model that? <<
>They don't care about your trips to AIDS infected Africa? Avian flu
>in China? Mad Cow Disease in the UK?
Hi Joe,
They don't care about trips to Africa, since AIDS only infects in
specific ways. I do have to fill in some Y/N checkboxes with questions
about my sexual activities (I recall questions about sex with males, for
money, or with drug addicts of the top of my head; there are probably
more), use of non-sterile needles, or reception of blood products.
I don't recall questions about China (it's been a while since my last
donation), but they do care about mad cow disease, because I have to
declare not having visited the UK during a specific time period (in this
case, it's an absolute time period, not relative to the moment of the
blood donation).
> I think I would go with three
>tables -- Donors for your general demographics, Donations for a
>history of donations and a Quarantine list with countries and dates
>that would exclude a donation at this time.
I don't know about blood centres in the US, but here in the Netherlands
the questionnaires for blood donors are designed based on health
inspection and legal requirements, not by some database guy.
The database guy gets to design the database. The input for that
database has to come from the questionnaire.
>I would first check the date of the last donation to be sure that you
>are not giving too much and too often. In the US you get recognition
>for donating over a gallon, etc. We want a history in the DB.
The Dutch blood collecting organisation maintains records of all my
donations. They don't even invite me or allow me to make an appointment
when insufficient time has passed since the last time. However, that is
unrelated to the question I asked you.
>Then I would ask "what countries have you visited since << insert
>date>>?" One of the options could include the Netherlands, altho the
>front end might present it as "None" or something similar to the
>donor. When the situation changes, I change the Quarantine list.
Again, the question list (that does indeed change from time to time as
new risc factors are discovered) is not designed by the DB guy. So I
repeat my question - how would your table design look, given that the
input comes mainly from check boxes ticked by the donor?
--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis|||Surely you aren't expecting an answer Hugo?
I think an example of the form is:
http://www.bloodbook.com/form-donorpre.html
Dozens and Dozens of YES / NO questions; in reality you'd probably have a
questions table and then fk that between the individual and the response -
yes or no; but it would still be a flag - Y or N or a bit 1 or 0.
I'm really looking forward to seeing celko's answer - perhaps he could use
the link above as a real world requirement.
Tony.
--
Tony Rogerson, SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson
[Ramblings from the field from a SQL consultant]
http://sqlserverfaq.com
[UK SQL User Community]
"Hugo Kornelis" <hugo@.perFact.REMOVETHIS.info.INVALID> wrote in message
news:755ju39jums6i3s7bvd14v4r8aigmcgcij@.4ax.com...
> On Mon, 24 Mar 2008 20:25:15 -0700 (PDT), --CELKO-- wrote:
>> "have you been in the United Stated during the last 90 days"? and two
>> check boxes conveniently marked yes and no. How would you model that?
>> <<
>>They don't care about your trips to AIDS infected Africa? Avian flu
>>in China? Mad Cow Disease in the UK?
> Hi Joe,
> They don't care about trips to Africa, since AIDS only infects in
> specific ways. I do have to fill in some Y/N checkboxes with questions
> about my sexual activities (I recall questions about sex with males, for
> money, or with drug addicts of the top of my head; there are probably
> more), use of non-sterile needles, or reception of blood products.
> I don't recall questions about China (it's been a while since my last
> donation), but they do care about mad cow disease, because I have to
> declare not having visited the UK during a specific time period (in this
> case, it's an absolute time period, not relative to the moment of the
> blood donation).
>> I think I would go with three
>>tables -- Donors for your general demographics, Donations for a
>>history of donations and a Quarantine list with countries and dates
>>that would exclude a donation at this time.
> I don't know about blood centres in the US, but here in the Netherlands
> the questionnaires for blood donors are designed based on health
> inspection and legal requirements, not by some database guy.
> The database guy gets to design the database. The input for that
> database has to come from the questionnaire.
>>I would first check the date of the last donation to be sure that you
>>are not giving too much and too often. In the US you get recognition
>>for donating over a gallon, etc. We want a history in the DB.
> The Dutch blood collecting organisation maintains records of all my
> donations. They don't even invite me or allow me to make an appointment
> when insufficient time has passed since the last time. However, that is
> unrelated to the question I asked you.
>>Then I would ask "what countries have you visited since << insert
>>date>>?" One of the options could include the Netherlands, altho the
>>front end might present it as "None" or something similar to the
>>donor. When the situation changes, I change the Quarantine list.
> Again, the question list (that does indeed change from time to time as
> new risc factors are discovered) is not designed by the DB guy. So I
> repeat my question - how would your table design look, given that the
> input comes mainly from check boxes ticked by the donor?
> --
> Hugo Kornelis, SQL Server MVP
> My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis|||>> Have you ever donated blood in the U.S.? <<
I am also a gallon donor and my sister-in-law worked for the Red Cross
blood bank program for a few years.
>> I have donated just under 3 gallons and I have never been asked about this stuff. <<
We had some problems in the US with contaminated blood supplies,
mostly Hepatitis strains. But HIV/AIDS got the public attention.
>> There are a few yes / no questions about certain countries and certain time periods. There is also a yes / no question about whether I have ever exchanged sex for money. On that there is not a request for a list of transactions and/or acts/positions, either. Just a check box. <<
If you get a clean bill of health, can you then give blood again? In
Los Angeles and San Francisco a few years ago, the HIV/AIDS testing
services did give you a dated card as proof. This was important
because the Gay community was trying to collect blood from healthy for
the infected members. This was the "take-home exam with essay
questions and bonus points: for donations.
>> But Joe, different states have different organizations that collect blood .. some multiple. So, I could have donated to some blood donor trailer in Arkansas yesterday, and for some reason want to donate again today in Boston. <<
In the US, the Red Cross is the major one. But I agree that the US
has horrible problems with decentralized medical care (hence HIPAA and
other data and database standards), decentralized school standards,
weird liquor and sex laws, etc.
>> What you are expecting is like going to the equivalent of the DMV in Tehran or Beirut or Cairo and expecting them to be able to print you up a new Rhode Island driver's license. <<
Can I get an International driver's license in Tehran or Beirut or
Cairo? Today, my driving records are checked when I go to change to a
new state -- complete with traffic tickets and possible arrest!
>> They do ask that... "what countries have you visited in the past 36 months?" is the one in my state. But they also want to know about your presence in certain countries in certain periods of time, e.g. Congo between 1978 and 1982, etc. Now do you really think they are going to leave an open-ended essay question, require a certified copy of your travel itinerary, etc.? <<
No, you will get a drop-down list with the (ever changing) quarantine
areas for the appropriate time period. Quick, easy and no more
trouble than any other website form we have come to know and love. It
takes longer to type in your name and address. But say I get thru the
screening and I am contaminated; where is the mechanism to flag me as,
say, a Hep-C carrier when they screen my blood? We need that history
table so I can be barred for life.
>> Does the concept of only needing a yes/no answer really escape you this much? You are a clever person, and I have to assume this is all for show. <<
"Yes/No" is an *answer* to a question. Databases (should) record
simple facts from which *questions* are constructed. Do you see that
concept? A data model is a stylized version of a world and we keep, in
the words of Sgt. Joe Friday, "Just the facts, madam."|||>> Does the concept of only needing a yes/no answer really escape you this
>> much? You are a clever person, and I have to assume this is all for
>> show. <<
> "Yes/No" is an *answer* to a question. Databases (should) record
> simple facts from which *questions* are constructed. Do you see that
> concept? A data model is a stylized version of a world and we keep, in
> the words of Sgt. Joe Friday, "Just the facts, madam."
I give up Joe. You are being a BIT [NOT NULL] of a moron here.|||>> "Yes/No" is an *answer* to a question. Databases (should) record simple
>> facts from which *questions* are constructed. Do you see that concept?
I think you got it backwards. Databases should record simple facts from
which *answers* can be constructed for questions. If "Yes/No" is an answer
to some question, the DBMS must allow for it be recorded so that other
questions can be answered. It is closure, remember?
--
Anith|||>>I think you got it backwards. Databases should record simple facts from which *answers* can be constructed for questions. <<
I do like that better!
>> If "Yes/No" is an answer to some question, the DBMS must allow for it be recorded so that other questions can be answered. It is closure, remember? <<
With that model, a database would already have **all** possible
answers in it before all possible questions are asked. Not possible
even in theory. And the answers are at a different level than the
facts -- the old data and meta data getting mixed together problem,
the need to constantly update the flags when the underlying facts
changes, etc.
Closure in a mathematical system assures that you can compute an
answer from facts, rules, axioms, etc. when presented with a
question.|||> the need to constantly update the flags when the underlying facts
> changes,
When the flag itself is the fact, this makes very little sense.|||>> With that model, a database would already have **all** possible answers
>> in it before all possible questions are asked.
Not all possible answers, but a provision to record any possible answer.
A DBMS derives new facts (query results) from a set of asserted facts (the
database). If the initial assertions are true and derivation rules are
consistent the derived facts are true (i.e. query results are correct).
Therefore as a deductive logic system, when required the DBMS must allow for
the derived facts to be recorded as needed, not that it should record all
possible derivations.
We already do this all the time using views, derived tables, CTEs etc.
>> And the answers are at a different level than the facts -- the old data
>> and meta data getting mixed together problem, the need to constantly
>> update the flags when the underlying facts changes, etc.
Conceptually the answers themselves are facts if the inference rules are
sound. It is we who assume a distinction due to limitations in our tools,
methods and languages.
Yes, the term "flag" carries certain baggage from CS history, but I fail to
see an argument to avoid a domain with two values, in general.
--
Anith|||>> Yes, the term "flag" carries certain baggage from CS history, but I fail to see an argument to avoid a domain with two values, in general. <<
If it is really a domain, I have no problem. I might have problems
with a domain that has only one value, tho. The problem is that a
flag is a computation, predicate, etc. that needs to be derived over
and over from other values. To describe a box, would you store
(length, width, height) then add a column for volume? Of course not
-- it is redundant. Volume by itself would also hide information
about the three variables that went into computing it.|||On Wed, 26 Mar 2008 05:19:03 -0000, Tony Rogerson wrote:
>Surely you aren't expecting an answer Hugo?
Hi Tony,
Not really. Joe has a habit of not answering my question when he
realises that he, once more, has cornered himself. This thread proofs
it: he is still replying to Aaron and Anith, but chooses to ignore this
part of the thread.
Ah well. Silence sometimes speaks louder than words... :-)
>I think an example of the form is:
>http://www.bloodbook.com/form-donorpre.html
Great example. And in fact quite similar to the one I had to fill out
yesterday, when I went to donate blood (or actually blood plasma).
>I'm really looking forward to seeing celko's answer - perhaps he could use
>the link above as a real world requirement.
Surely you aren't expecting an answer Tony?
--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis|||> with a domain that has only one value, tho. The problem is that a
> flag is a computation, predicate, etc. that needs to be derived over
> and over from other values.
A yes/no question does not have to derived / computed / etc. Do you like
lemonade? Yes. Or no. Maybe your answer changes over time. But it does
not depend on any other values whatsoever.
> To describe a box, would you store
> (length, width, height) then add a column for volume? Of course not
> -- it is redundant. Volume by itself would also hide information
> about the three variables that went into computing it.
We're not even talking about the same thing here. You are blindly grasping
for an example of something that IS computed and clearly IS NOT what anyone
in their right mind would consider a "flag"... play fair and honest, celko!|||> We're not even talking about the same thing here. You are blindly
> grasping for an example of something that IS computed and clearly IS NOT
> what anyone in their right mind would consider a "flag"... play fair and
> honest, celko!
create table celko_answer_clutching (
forum_nntp_guid uniqueidentifier not null primary key,
post_author varchar(200) not null,
is_straw char(1) not null check( is_straw = 'Y' or ( post_author <>
'--celko--' and is_straw in ( 'Y', 'N' ) ) ) )
)
Tony Rogerson, SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson
[Ramblings from the field from a SQL consultant]
http://sqlserverfaq.com
[UK SQL User Community]
"Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in message
news:23520CDC-93AE-4DBF-A2B7-D3E321CFD32B@.microsoft.com...
>> with a domain that has only one value, tho. The problem is that a
>> flag is a computation, predicate, etc. that needs to be derived over
>> and over from other values.
> A yes/no question does not have to derived / computed / etc. Do you like
> lemonade? Yes. Or no. Maybe your answer changes over time. But it does
> not depend on any other values whatsoever.
>> To describe a box, would you store
>> (length, width, height) then add a column for volume? Of course not
>> -- it is redundant. Volume by itself would also hide information
>> about the three variables that went into computing it.
> We're not even talking about the same thing here. You are blindly
> grasping for an example of something that IS computed and clearly IS NOT
> what anyone in their right mind would consider a "flag"... play fair and
> honest, celko!|||>> A yes/no question does not have to derived / computed / etc. Do you like lemonade? Yes. Or no. Maybe your answer changes over time. But it does not depend on any other values whatsoever. <<
But that preference is a value:
CREATE TABLE Foobar
(..
beverage_choice VARCHAR(15) DEFAULT 'water' NOT NULL
CHECK (beverage_choice IN ('lemonade', 'beer', etc.)),
Surely, you would not ask "Do you live in Texas?" "Do you live in
California?" etc. for all 54 states and territories? And don't forget
a constirant to assure that only one of the 54 flags is set to 1.
>> You are blindly grasping for an example of something that IS computed and clearly IS NOT what anyone in their right mind would consider a "flag". <<
I defined a flag as bit that is set to mark an event or set of
conditions (think of Dijkstra's semaphores and other low-level
assembly language tricks). But the event or conditions do not have to
be the agent that sets the flag. Nor does a domain with only two
values have to be a flag. In RDBMS, we want to have the set of
conditions, so we can compute the answer to the question with a search
condition:
SELECT guest_name
FROM Invitations
WHERE beverage_choice = 'lemonade';
Do you see the difference?|||> But that preference is a value:
> CREATE TABLE Foobar
> (..
> beverage_choice VARCHAR(15) DEFAULT 'water' NOT NULL
> CHECK (beverage_choice IN ('lemonade', 'beer', etc.)),
Why? What if the company is Country Time and all they care about is
lemonade?
> Surely, you would not ask "Do you live in Texas?" "Do you live in
> California?" etc. for all 54 states and territories?
No, but if I was a company that could only sell to people in Texas, I might
ask "Do you live in Texas?" exactly once. If you live in some other state,
survey over.
> I defined a flag as bit that is set to mark an event or set of
> conditions (think of Dijkstra's semaphores and other low-level
> assembly language tricks). But the event or conditions do not have to
> be the agent that sets the flag.
But often the condition IS the flag, Joe! Let's say I have a table that
represents a set of servers in my network. At any single point in time, I
run a process that must determine which of the servers are labeled as
"Active"... why is it so wrong to have a column called IsActive or Active,
with the possible values of Y or N? Consider that I don't care about the
history, when it became the current state, etc. I just want to know which
servers are active right now. Sorry, but this is the way I would model it.
You can design it your own way if you want.
> Do you see the difference?
I see the difference, but you either don't understand my point, or are
intentionally ignoring it.|||On Mar 28, 1:57 pm, --CELKO-- <jcelko...@.earthlink.net> wrote:
> >> A yes/no question does not have to derived / computed / etc. Do you like lemonade? Yes. Or no. Maybe your answer changes over time. But it does not depend on any other values whatsoever. <<
> But that preference is a value:
> CREATE TABLE Foobar
> (..
> beverage_choice VARCHAR(15) DEFAULT 'water' NOT NULL
> CHECK (beverage_choice IN ('lemonade', 'beer', etc.)),
> Surely, you would not ask "Do you live in Texas?" "Do you live in
> California?" etc. for all 54 states and territories? And don't forget
> a constirant to assure that only one of the 54 flags is set to 1.
I agree with Aaron.
Let me try a different example. Suppose you are buying life insurance.
Suppose you need to answer a yes/no question "Have you consumed
tobacco products in the last five years?" Your answer to this
question is a fact which might be stored in the database regardless of
your actual smoking history.
In fact, your answer can be matched against your actual smoking
history, and your answers to similar questions in your medical
history. So, your yes/no answers and your actual underlying facts are
just two different kinds of facts. A database designer cannot assume
that huamns always give correct and consistent answers, this is why
both the underlying facts and your current interpretation of them
might need to be stored.|||>> What if the company is Country Time and all they care about is lemonade? <<
They might only keep data on Lemonade drinkers, such as liters per
month drunk. The flag would be redundant.
>> No, but if I was a company that could only sell to people in Texas, I might ask "Do you live in Texas?" exactly once. If you live in some other state, survey over. <<
That is done at input time and no data is being kept in the RDBMS. As
you said, "survey over" and we don't bother with it.
>> Let's say I have a table that represents a set of servers in my network. At any single point in time, I run a process that must determine which of the servers are labeled as "Active"... why is it so wrong to have a column called IsActive or Active, with the possible values of Y or N? Consider that I don't care about the history, when it became the current state, etc. I just want to know which servers are active right now. Sorry, but this is the way I would model it. You can design it your own way if you want. <<
So how do I set this flag? I will be monitoring my network; each
server will have a (login_time, logout_time) pair to show when it is
on-line; the active_flag is set by the search condition
(CURRENT_TIMESTAMP BETWEEN login_time AND logout_time). Since servers
come with clocks, why would anyone want to throw away data they
collect at no cost?
Let me give you an example of a domain with two values:
CREATE TABLE BloodDonors
(donor_id CHAR(9) NOT NULL PRIMARY KEY,
blood_type CHAR(2) NOT NULL
CHECK (blood_type IN ('A', 'B', 'AB', 'O'),
rh_factor CHAR(1) DEFAULT '+' NOT NULL
CHECK (rh_factor IN ('+', '-')),
..);
My Rh factor is a fact. I can validate it with a CHECK() because it
is a very small domain. I can verify (not set, verify) it with blood
test kit. It is not set by an event; it is a simple fact.|||>> Suppose you are buying life insurance. Suppose you need to answer a yes/no question "Have you consumed tobacco products in the last five years?" Your answer to this question is a fact which might be stored in the database regardless of your actual smoking history. <<
So, what happens in 2015 after I have been smoke-free for years? A
properly designed DB would have an application date and would be able
to determine when the value of that column is now "Unknown" and I
might be eligible for a discount. The data extracted from the input
question is "From 2003 to 2008, Joe Celko used some tobacco" and this
might lead to "how many packs a day?" (for my father, that answer was
4 to 5 packs; he only lived to 85).
>> So, your yes/no answers and your actual underlying facts are just two different kinds of facts. <<
Yes! That is one of my points. Flags are at a different and
derivable level of aggregation. The goal of an OLTP database is to
capture the most basic facts and not the aggregations and summaries.
Data Warehouses deal with aggregations.
>> A database designer cannot assume that humans always give correct and consistent answers, this is why both the underlying facts and your current interpretation of them might need to be stored. <<
Agreed. This is why I stress validation and verification so much.
This is why I need to get into Data Quality sometime real soon now.|||> They might only keep data on Lemonade drinkers, such as liters per
> month drunk. The flag would be redundant.
Quit changing my question!
>> No, but if I was a company that could only sell to people in Texas, I
>> might ask "Do you live in Texas?" exactly once. If you live in some
>> other state, survey over. <<
> That is done at input time and no data is being kept in the RDBMS. As
> you said, "survey over" and we don't bother with it.
Not necessarily true! What if the next question is, "Would you ever
consider moving to Texas?"
We can go on and on all day, and I can come up with plenty of questions that
can and should be modeled as T/F, Y/N, 1,0. And you can argue every single
one of them with whatever arguments you want to make up at runtime. You
will still be wrong.
> So how do I set this flag? I will be monitoring my network; each
> server will have a (login_time, logout_time) pair to show when it is
> on-line;
WRONG! I did not say online/offline; it does not have anything to do with
whether the server is "up" or not. That is something that can be detected
at runtime, anyway. Ping server. Response? Up! No response? Down! Why
should I bother storing this, unless I am interested in uptime history? (I
already explained that I am not.) Maybe I am moving the server in or out of
a cluster, or am doing maintenance on it, or maybe I am isolating it to test
the next version of my application. Stop pigeon-holing scenarios into your
pigeon-holed view of how the real world works!
A|||On Mar 28, 4:32 pm, --CELKO-- <jcelko...@.earthlink.net> wrote:
> >> So, your yes/no answers and your actual underlying facts are just two different kinds of facts. <<
> Yes! That is one of my points. Flags are at a different and
> derivable level of aggregation.
Of course not - human answers may and do disagree with the underlying
facts. The fact that you put a check on signed and dated sheet of
paper cannot be derived from your actual smoking history.|||Aaron,
Take a look at Hugo's post - celko is completely ignoring because he can't
answer it.
Here's an example of a blood donor form -
http://www.bloodbook.com/form-donorpre.html - we'd all love to see how celko
would model it.
--
Tony Rogerson, SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson
[Ramblings from the field from a SQL consultant]
http://sqlserverfaq.com
[UK SQL User Community]
"Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in message
news:A0FDAEBF-E1A5-40AB-AA30-C3D3A06416A1@.microsoft.com...
>> They might only keep data on Lemonade drinkers, such as liters per
>> month drunk. The flag would be redundant.
> Quit changing my question!
>> No, but if I was a company that could only sell to people in Texas, I
>> might ask "Do you live in Texas?" exactly once. If you live in some
>> other state, survey over. <<
>> That is done at input time and no data is being kept in the RDBMS. As
>> you said, "survey over" and we don't bother with it.
> Not necessarily true! What if the next question is, "Would you ever
> consider moving to Texas?"
> We can go on and on all day, and I can come up with plenty of questions
> that can and should be modeled as T/F, Y/N, 1,0. And you can argue every
> single one of them with whatever arguments you want to make up at runtime.
> You will still be wrong.
>> So how do I set this flag? I will be monitoring my network; each
>> server will have a (login_time, logout_time) pair to show when it is
>> on-line;
> WRONG! I did not say online/offline; it does not have anything to do with
> whether the server is "up" or not. That is something that can be detected
> at runtime, anyway. Ping server. Response? Up! No response? Down!
> Why should I bother storing this, unless I am interested in uptime
> history? (I already explained that I am not.) Maybe I am moving the
> server in or out of a cluster, or am doing maintenance on it, or maybe I
> am isolating it to test the next version of my application. Stop
> pigeon-holing scenarios into your pigeon-holed view of how the real world
> works!
> A|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:d7d504b2-9f64-40d8-a76e-dc70051b514f@.8g2000hse.googlegroups.com...
>> A yes/no question does not have to derived / computed / etc. Do you
>> like lemonade? Yes. Or no. Maybe your answer changes over time. But
>> it does not depend on any other values whatsoever. <<
> But that preference is a value:
> CREATE TABLE Foobar
> (..
> beverage_choice VARCHAR(15) DEFAULT 'water' NOT NULL
> CHECK (beverage_choice IN ('lemonade', 'beer', etc.)),
> Surely, you would not ask "Do you live in Texas?" "Do you live in
> California?" etc. for all 54 states and territories? And don't forget
> a constirant to assure that only one of the 54 flags is set to 1.
> SELECT guest_name
> FROM Invitations
> WHERE beverage_choice = 'lemonade';
> Do you see the difference?
Yes, you went from the question, "Do you like FOO?" to "What is your choice
for dinner?"
Two completely different questions.
Nice try though.
Greg Moore
SQL Server DBA Consulting Remote and Onsite available!
Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html|||--CELKO-- wrote:
> Yes! That is one of my points. Flags are at a different and
> derivable level of aggregation. The goal of an OLTP database is to
> capture the most basic facts and not the aggregations and summaries.
> Data Warehouses deal with aggregations.
Why do you assume that all flags are aggregations? Consider, for
instance, software designed to be sold to multiple businesses (rather
than used in-house at a single one), so it has customization options
stored in a one-row table, e.g. whether customer statements should list
old charges individually until paid (open item) or roll them into a
single "previous balance" amount (balance forward). This sort of
yes/no answer is not aggregated from any other facts, but chosen
directly by the user who initially configures the software.|||>> Why do you assume that all flags are aggregations? <<
Not aggregations per se, but things set by events at a different level
of abstraction in the data model. Most of my postings have dealt with
things that should be deduced from simple facts within the schema
("John is eligible for a higher credit rating because he passed one of
several tests (predicates and formulas)" as opposed "We gave John a
higher credit rating by setting a flag and have no idea why!")
>> Consider, for instance, software designed to be sold to multiple businesses (rather than used in-house at a single one), so it has customization options stored in a one-row table, e.g. whether customer statements should list old charges individually until paid (open item) or roll them into a single "previous balance" amount (balance forward). This sort of yes/no answer is not aggregated from any other facts, but chosen directly by the user who initially configures the software. <<
Think about what you just described. Is it data inside the data model
for the schema? Nope. Configuration is SYSTEM LEVEL META DATA! You
cannot get much higher up the chain than that -- this is where
business rules, external legal requirements and stuff like that live.
It is set by the user because the database cannot configure itself at
that level. Typically, you are even beyond the Schema Information
Tables at that level.|||On Mar 31, 12:43 pm, --CELKO-- <jcelko...@.earthlink.net> wrote:
> >> Why do you assume that all flags are aggregations? <<
> Not aggregations per se, but things set by events at a different level
> of abstraction in the data model. Most of my postings have dealt with
> things that should be deduced from simple facts within the schema
> ("John is eligible for a higher credit rating because he passed one of
> several tests (predicates and formulas)" as opposed "We gave John a
> higher credit rating by setting a flag and have no idea why!")
I have provided you some examples when a flag cannot be derived from
other data. You have chosen not to reply.|||The example of a medical questionnaire is very appropriate for me
right now. I am getting a physical on 2008-04-01 and just had to fill
a four-page basic intake questionnaire I got in the mail.
1)The pre-existing conditions are asked as yes/no questions for the
intake form ("Do you have high cholesterol?") so that they can be
measured on an appropriate scale later in the exam (LDL cholesterol,
HDL cholesterol, and triglycerides)
2)The surgery list asks for the calendar year of the operations. Not
just yes/no, not within a range of years past, but the actual calendar
year. They want the fact, not a flag.
3)The family history also asks about the calendar years when family
members were diagnosed for heart problems, cancer, etc. Not just yes/
no, not within a range of years past, but the actual calendar year.
They want the fact, not a flag.
4)The "life style" questions are also detailed and not just flags;
they want measurements.
1.Do you use tobacco? What kind? (cigarettes, cigars, snuff, etc.)
How much?
2.Do you drink alcohol? What kind (beer, wine, liquor, etc.) How many
drinks per week?
3.Do you use caffeine? What kind? (coffee, tea, etc.) How many drinks
per day?
4.How many sex partners do you have? What genders? Animals don't seem
to count:)
5)Male and female conditions are clearly separated to avoid
conflicting data entries. One of the problems with flags is that
certain combinations might not be valid data -- "pregnant men" -- and
you need elaborate CHECK() constraints to avoid bad data. But this is
a Data Quality issue.
This sort of form is for intake only; it is not meant to be a medical
record. The actual database will contain my blood pressure, blood
type, cholesterol level and any tests indicated by the intake form --
not a yes/no flag for "do you have blood:)
Now, we are into data quality issues and the use of scales and
measurement. There standards for the acceptable levels of error and
risk in particular industries. There are measures of "fuzziness" in
data.
While all of this DQ stuff is important, it has little to do with the
use of flags in an RDBMS.|||On Mar 31, 2:28 pm, --CELKO-- <jcelko...@.earthlink.net> wrote:
> The example of a medical questionnaire is very appropriate for me
> right now. I am getting a physical on 2008-04-01 and just had to fill
> a four-page basic intake questionnaire I got in the mail.
> 1)The pre-existing conditions are asked as yes/no questions for the
> intake form ("Do you have high cholesterol?") so that they can be
> measured on an appropriate scale later in the exam (LDL cholesterol,
> HDL cholesterol, and triglycerides)
> 2)The surgery list asks for the calendar year of the operations. Not
> just yes/no, not within a range of years past, but the actual calendar
> year. They want the fact, not a flag.
> 3)The family history also asks about the calendar years when family
> members were diagnosed for heart problems, cancer, etc. Not just yes/
> no, not within a range of years past, but the actual calendar year.
> They want the fact, not a flag.
> 4)The "life style" questions are also detailed and not just flags;
> they want measurements.
> 1.Do you use tobacco? What kind? (cigarettes, cigars, snuff, etc.)
> How much?
> 2.Do you drink alcohol? What kind (beer, wine, liquor, etc.) How many
> drinks per week?
> 3.Do you use caffeine? What kind? (coffee, tea, etc.) How many drinks
> per day?
> 4.How many sex partners do you have? What genders? Animals don't seem
> to count:)
> 5)Male and female conditions are clearly separated to avoid
> conflicting data entries. One of the problems with flags is that
> certain combinations might not be valid data -- "pregnant men" -- and
> you need elaborate CHECK() constraints to avoid bad data. But this is
> a Data Quality issue.
> This sort of form is for intake only; it is not meant to be a medical
> record.
The last attempt: this is not correct. In many cases your answers need
to be stored separately. An insurance companies may void a policy if
an answer is not correct. A resaercher may find it useful to match yes/
no answers against more detailed data. Once upon a time there was a
questionnaire which has the following question:
Do you have sex regularly?
In many cases "yes" meant "every month", and in many other cases "no"
meant "not every day".|||> It is also a bad idea to use proprietary BIT data types to fake
> assembly language style programming. =A0SQL is a predicate language;
> that is, we discover a fact with a predicate rather than set a flag.
Unless we're actually storing a Yes/No, True/False, or some other two-
valued data; as would seem to be indicated by the "INDICATOR" column
name.
INDICATOR, as an aside, is not a reserved word in SQL Server.|||>> Once upon a time there was a questionnaire which has the following question:
Do you have sex regularly? <<
LOL! That is an old Woody Allen joke about a man and woman going to a
therapist and being asked that question:
He: "Almost never, 3 times a week!"
She: "Constantly, 3 times a week!"|||>> Unless we're actually storing a Yes/No, True/False, or some other two-valued data; as would seem to be indicated by the "INDICATOR" column name. <<
I have no trouble with a two-valued domain; I even gave an example of
the Rh factor in blood typing. You just do not see them very often in
the real world.
>> INDICATOR, as an aside, is not a reserved word in SQL Server. <<
But it is for embedded SQL in the X3J languages adn SQL Server has an
embedding even if MS does not advertise it. There is more to the
world of RDBMS than just .NET programming.|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:51589058-a7a2-4a1e-a9c2-46ba44f49d4c@.59g2000hsb.googlegroups.com...
> >> Once upon a time there was a questionnaire which has the following
question:
> Do you have sex regularly? <<
> LOL! That is an old Woody Allen joke about a man and woman going to a
> therapist and being asked that question:
> He: "Almost never, 3 times a week!"
> She: "Constantly, 3 times a week!"
>
OK- since it seems to open-mike nite at the Improv....
Doctor: How's your sex life?
Patient: Infrequent.
Doctor: Is that one word or two?
Bob Lehmann|||> several tests (predicates and formulas)" as opposed "We gave John a
> higher credit rating by setting a flag and have no idea why!")
And do you think the fact that John likes lemonade should be described by
other facts in the database (e.g. what town he spent middle school in)? Or
do you think maybe that is just a fact on its own, not derived from
predicates, formulas, aggregations, or relayed from other facts in any way?|||--CELKO-- wrote:
>> Consider, for instance, software designed to be sold to multiple businesses (rather than used in-house at a single one), so it has customization options stored in a one-row table, e.g. whether customer statements should list old charges individually until paid (open item) or roll them into a single "previous balance" amount (balance forward). This sort of yes/no answer is not aggregated from any other facts, but chosen directly by the user who initially configures the software. <<
> Think about what you just described. Is it data inside the data model
> for the schema? Nope. Configuration is SYSTEM LEVEL META DATA! You
> cannot get much higher up the chain than that -- this is where
> business rules, external legal requirements and stuff like that live.
> It is set by the user because the database cannot configure itself at
> that level. Typically, you are even beyond the Schema Information
> Tables at that level.
I suppose you can define the aforementioned one-row table as not being
part of "the schema". Shrug.
In another message, you write:
> I have no trouble with a two-valued domain; I even gave an example of
> the Rh factor in blood typing. You just do not see them very often in
> the real world.
So what's the difference between a two-valued domain and a flag? In
particular, what if the two values are Yes/No?|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:51589058-a7a2-4a1e-a9c2-46ba44f49d4c@.59g2000hsb.googlegroups.com...
>> Once upon a time there was a questionnaire which has the following
>> question:
> Do you have sex regularly? <<
> LOL! That is an old Woody Allen joke about a man and woman going to a
> therapist and being asked that question:
> He: "Almost never, 3 times a week!"
> She: "Constantly, 3 times a week!"
>
http://en.wikipedia.org/wiki/Coolidge_effect
--
Greg Moore
SQL Server DBA Consulting Remote and Onsite available!
Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html|||>> And do you think the fact that John likes lemonade should be described by other facts in the database <<
How about a list of beverages, with the value Lemonade in it? That
would be nominal scale and not a flag. When we decide to research
other beverages, we extend the scale. Unlike a flag, I can ask how
much of a given beverage he drinks.
I extrapolate that if John drinks s certain number of Cokes and is of
a certain age, then I stand an 85% chance of selling him Pepsi (i.e.
kids the sweeter Pepsi to Coke). You cannot get that kind of
information from flags.|||> How about a list of beverages, with the value Lemonade in it?
As I have asked several times already, what if they are only interested in
lemonade? Why should they bother with a list?
Anyway, you are still clearly either missing the point or intentionally
disregarding it. This is not about lemonade; this is about the fact that
some things are just yes/no indicators on their own, without "help" from
other facts. If you don't get it, you don't get it, and I'm afraid I can't
help to educate you any further. :-(|||> Anyway, you are still clearly either missing the point or intentionally
> disregarding it. =A0This is not about lemonade; this is about the fact tha=t
> some things are just yes/no indicators on their own, without "help" from
> other facts. =A0If you don't get it, you don't get it, and I'm afraid I ca=n't
> help to educate you any further. =A0:-(
To my (mostly) unbiased eye, it looks like both sides are missing the
point (either intentionally or otherwise)... Joe has already said
that he accepts that there are rare cases where a 2 value domain is
valid, but that in *most* cases modelling a flag is not "correct".
But we all know that Joe's only interested in "correct" from a
standards and theoretical perspective though, so it is pointless to
argue the application of flags from a practical perspective.
=46rom a purely theoretical perspective I can see his point ... from a
practical perspective I have no problem disregarding his point if it
makes sense in the context of the problem I'm facing :)
For example (and in keeping with the theme of this thread), I've heard
that in market research it is fairly normal to build a database purely
to record the results of a single survey. This is essentially a throw-
away piece of work that exists only to analyse data from that single
survey, the reason being that it's more cost-effective than building a
"proper" database to store more generic survey results. In this
scenario it is a waste of time to design flexibility into the database
as the analysis is usually predefined by whatever research model the
company is using. The results of the analysis however, would likely
be stored in a well designed database to be compared against past/
future surveys...
J|||>>
> Anyway, you are still clearly either missing the point or intentionally
> disregarding it. This is not about lemonade; this is about the fact that
> some things are just yes/no indicators on their own, without "help" from
> other facts. If you don't get it, you don't get it, and I'm afraid I can't
> help to educate you any further. :-(
To my (mostly) unbiased eye, it looks like both sides are missing the
point (either intentionally or otherwise)... Joe has already said
that he accepts that there are rare cases where a 2 value domain is
valid, but that in *most* cases modelling a flag is not "correct".
Well, to be fair, I did not say that *most* columns should be a two-value
flag, either. But in my experience they are more common than Celko's
"advice" would lead one to believe.|||> Well, to be fair, I did not say that *most* columns should be a two-value
> flag, either. =A0But in my experience they are more common than Celko's
> "advice" would lead one to believe.
Perhaps what I should have said is "in *most* cases where a flag has
been modelled, it is not "correct" to have done so" :)|||>> As I have asked several times already, what if they are only interested in lemonade? Why should they bother with a list? <<
If the only concern is about lemonade preferences and consumption,
wouldn't everything in that table deal with lemonade consumption? If
so, why would they bother with a flag? It would be like a Personnel
table with a flag that asks "Are you an employee?" when the answer
would have to be "yes" to get into the table.
Let me recover a bit from my physical exam, x-rays and booster shots
and see if I can get a short article about scales, measurements and
data values versus question/answer and other types of flags.|||Having earned a living as a statistician, we don't like using SQL for
surveys. We have specialized tools that hide the data storage and
give us computations, special missing value rules which are a bitch to
write in CHECK() constraints, etc.
Preference scales with five degrees (very strong, strong, don't care,
weak, very weak) are better than scales with fewer choices; those with
more than 7 degrees are not as repeatable (ask the same question (n)
day later and the profile changes). Nobody who knows what they are
doing would use Aaron's Y/N on a Lemonade survey.|||> If the only concern is about lemonade preferences and consumption,
> wouldn't everything in that table deal with lemonade consumption?
Celko, you are too much.

No comments:

Post a Comment