Measure & Evaluate The Quality of The Data
In view of the emerging trend to retrieve data from ERP systems and other similar central systems into Excel, it has become important to discuss the quality of the retrieved data. Since all data is part of one or more decision processes, its quality has an impact on the outputs of these processes.
From my point of view there is a general dilemma with ERP systems and the process of registering the data, especially when it’s done manually. If the registration is dictating, then some of the data may not be valid, but since it’s required it’s filled in anyway. If the registration is not dictating, i.e. users are allowed to leave fields empty, then some data will be left out.
When measuring the quality of data from a dictating system it’s very difficult to locate the less obvious errors (values can be inside a predefined range and still be wrong), while it’s much easier to spot data that has been left out.
In my experience most corporate ERP systems are dictating when it comes to order data, production data and staff data. For sales data they prefer less dictating reporting but, on the other hand, rely on the data being revised and updated on a regular basis to reflect the ongoing changes.
For me the purpose of measuring and evaluating the quality of the retrieved data is twofold:
- By presenting the quality of the data to the decision makers, they get a better understanding of the viewed data and can better evaluate the performance indicators.
- By continuously measuring the data quality, focus can be put on improving it, and in the long run it becomes a better input to the decision processes.
Of course, it’s always a trade-off between what is possible or desirable and the costs involved.
In my experience we only need basic quality indicators, as these kinds of indicators can be viewed and understood throughout corporates.
How do you measure and evaluate the quality of data retrieved from ERP systems as well as from other central sources?
Kind regards,
Dennis
PS: Before anyone asks for a case study, I’m not able to provide one, not even in the most simplified form, as it would violate my present NDAs.
Jon Peltier:
Dennis -
Thanks for a thoughtful post. In my experience, as soon as data makes it into a chart, a great many people regard it as Gospel, or something handed down from the mountain top.
Only one of my clients takes this seriously, and they use monstrous statistical models to describe uncertain quantities, then run Monte Carlo analyses on these models to compute likely ranges of values. It’s like huge business forecasting. Hmm, I guess NDA dictates that I reveal no more.
Never mind a case study. Qualitatively, how do you deal with missing or questionable data? Do you try to fill in blanks with some kind of averages (past performance, say, or data from similar categories)? Do you assign some kind of range, a plus-minus with a certain confidence?
How do you present the tweaked data? Do you put a notation, or change the color in a table or chart? Do you use error bars, or bands, or other visual clues? How do your clients interpret these indicators?
- Jon
24 June 2006, 6:22 am
Rob van Gelder:
Nice post Dennis - thanks.
What does your ‘quality of data’ report look like when you present them to decision makers?
Cheers,
Rob
24 June 2006, 2:12 pm
XL-Dennis:
Jon,
Thanks for the input and the questions you raise. I’m not, by any means, an expert in the area, but I’ve learned to keep things simple. Much of the knowledge I have today was gained by working with large corporates.
The expression ‘Garbage in, garbage out’ is not necessarily accepted by all companies. I’ve noticed that the larger corporates have a better understanding that the ERPs cannot provide them with better data than what goes into the ERPs. Unless the corporates themselves have a driving force to put data quality on the agenda, it’s difficult to raise the question as an external consultant. I’ve done it a few times but without any success. In one company I managed to ‘persuade’ them, but I was later ‘replaced’ by the vendor of the ERP when the work was done.
Production data:
Unless it’s new production line(s) and/or new machine(s) there usually exists reliable historical data that can be used.
All ‘tweaked’ data are singled out by using
- different colours (in some cases red is used while in other cases both yellow and red are used and in one particular case grey is used)
- changing the font’s size (due to colour blindness).
Computed indicators
- Standard Mean
- Standard Mean with standard deviation (+/-) which is the most common approach.
- Weighted Mean with ‘weighted’ deviation (+/-).
In one company they calculate the mean with a 95 % confidence interval.
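A minimal Python sketch of the indicators listed above, for readers who want to reproduce them outside Excel. The sample values, the use of batch sizes as weights, and the normal approximation (z = 1.96) for the 95 % interval are all assumptions made for illustration; they are not taken from any of the companies mentioned.

```python
import math
import statistics


def mean_with_spread(values):
    """Return (mean, sample standard deviation)."""
    return statistics.mean(values), statistics.stdev(values)


def weighted_mean_with_spread(values, weights):
    """Return (weighted mean, weighted standard deviation).

    Uses a population-style weighted variance; other definitions exist.
    """
    total = sum(weights)
    wmean = sum(v * w for v, w in zip(values, weights)) / total
    wvar = sum(w * (v - wmean) ** 2 for v, w in zip(values, weights)) / total
    return wmean, math.sqrt(wvar)


def confidence_interval_95(values):
    """Approximate 95 % interval for the mean (normal approximation)."""
    m = statistics.mean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return m - half_width, m + half_width


# Hypothetical cycle times (minutes) and batch sizes used as weights.
cycle_times = [10.0, 12.0, 11.0, 13.0, 14.0]
batch_sizes = [1, 1, 2, 1, 1]

print(mean_with_spread(cycle_times))
print(weighted_mean_with_spread(cycle_times, batch_sizes))
print(confidence_interval_95(cycle_times))
```

With only a handful of values a t-based interval would be wider than the normal approximation used here, so treat the interval as a rough guide.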
When it comes to presentation most data is viewed in tables and charts. As for the chart types they are plotted charts with ‘error’ bars, scatter charts, line charts and column – area charts.
One particular company I know of uses only ‘Stock’ charts, but please don’t ask me why.
Sales data:
The high degree of uncertainty about sales, as well as the sales staff’s optimistic point of view, may mean there is no reliable historical data to work with. In general it’s not recommended to use any ‘tweaked’ data.
One way to measure the quality is to divide the sales process into several stages. In that way the sales staff can handle the uncertainty better and also update the data in the system when the status changes. It then becomes possible to:
- measure the number of won and lost deals that are still in the system.
- measure the number of opportunities that miss important data, and categorize them into different groups according to what is missing.
- measure the (average) number of days in each sales stage.
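The three measurements above can be sketched in a few lines of Python. The field names (`stage`, `customer`, `amount`, `expected_close`, `days_in_stage`), the stage names and the sample rows are hypothetical; a real extract from a CRM or ERP system will look different.

```python
from collections import Counter, defaultdict

# Fields assumed to be "important"; adjust to the system at hand.
REQUIRED_FIELDS = ("customer", "amount", "expected_close")

# Hypothetical extract of sales opportunities.
opportunities = [
    {"id": 1, "stage": "Lead", "customer": "A", "amount": 1000,
     "expected_close": "2006-08-01", "days_in_stage": 12},
    {"id": 2, "stage": "Quote", "customer": "B", "amount": None,
     "expected_close": "2006-09-15", "days_in_stage": 30},
    {"id": 3, "stage": "Won", "customer": "C", "amount": 5000,
     "expected_close": None, "days_in_stage": 90},
    {"id": 4, "stage": "Lead", "customer": None, "amount": None,
     "expected_close": None, "days_in_stage": 20},
]


def closed_still_in_system(rows, closed_stages=("Won", "Lost")):
    """Count won/lost opportunities that were never cleared out."""
    return sum(1 for r in rows if r["stage"] in closed_stages)


def missing_field_counts(rows):
    """Count, per required field, how many opportunities lack a value."""
    counts = Counter()
    for r in rows:
        for field in REQUIRED_FIELDS:
            if r.get(field) is None:
                counts[field] += 1
    return dict(counts)


def average_days_per_stage(rows):
    """Average number of days opportunities have spent in each stage."""
    by_stage = defaultdict(list)
    for r in rows:
        by_stage[r["stage"]].append(r["days_in_stage"])
    return {stage: sum(days) / len(days) for stage, days in by_stage.items()}


print(closed_still_in_system(opportunities))
print(missing_field_counts(opportunities))
print(average_days_per_stage(opportunities))
```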
Depending on volume and stages it can be presented in:
- plain data table(s)
- simple pie chart(s) with percentage view(s)
- PivotTables
In some corporates there exists a ‘chart’ culture, while in others there exists a ‘number’ culture when it comes to presenting data. In turn, this controls how data quality should be presented.
Kind regards,
Dennis
24 June 2006, 3:02 pm
brettdj:
“Thanks for a thoughtful post. In my experience, as soon as data makes it into a chart, a great many people regard it as Gospel, or something handed down from the mountain top”
Or worse: if that data is presented in a PowerPoint slide to management, it becomes a Key Performance Indicator for that year. As an aside, a trick I learnt early in my career was to present cost ranges backwards, i.e. 15-10 Million, because once management was told the project was 10-15 Million, all they heard was the 10.
The Monte Carlo method you describe sounds fine to me, in that it takes uncertainty into account when looking at outcomes.
“Tweaking” data is another matter. I’d certainly follow Dennis’ comments and either highlight any tweaked data graphically or, if this made the chart too busy, add an asterisk and a footnote below.
24 June 2006, 5:49 pm
Hui...:
Companies which are run by people who understand the business don’t generally suffer this problem.
The management know the answer beforehand, and something out of left field sticks out as wrong.
Management who make decisions on the results of a spreadsheet, without inherent safe IRRs and the business understanding of what is proposed, are doomed to fail within the errors.
Hui…
24 June 2006, 9:39 pm
XL-Dennis:
Rob,
When it comes to presentation I have, over the years, been influenced by Edward Tufte, Jonathan Koomey and Stephen Few. I combine their ideas with my own preference for a minimum of indicators in a black/white UI that fits on one A4 page and can be read without a magnifying glass.
Depending on the layout and number of indicators I end up either adding the quality indicators to the critical business indicators on one sheet or placing them on a separate sheet named ‘Quality’.
Nothing fancy: simple charts, and not even close to all the multi-coloured and flashy dashboards that seem to be very popular nowadays.
Dave,
Thanks for your comments. It’s an interesting point that people only hear one number rather than an interval.
Hui,
I read your post several times but I don’t exactly understand what you are aiming at. If managers don’t know their business they will be out of business very soon. That I can agree on, but if you imply that ERP systems always generate 100 % reliable data then I have to disagree with you. Please post a follow-up so that I can understand your point better.
Kind regards,
Dennis
25 June 2006, 4:09 am
Harald Staff:
Man can I relate to this…
A few months ago I went to see a demo of the (at the time) new SQL Server. And after 6 years of various Business Intelligence marketing (“your data represents a gold mine”), it was a great relief to hear an official MS person say: “Raw data is garbage. Until someone washes it, data is useless garbage.”
I built a production scheduling/resource allocation system, tailored for what we do. It has a date field “production start”. And I decided that this should be mandatory; we really need this info. After a while, we saw that a bunch of productions had production start January 1st. This is a holiday, so we asked, and found that the users had agreed upon January 1st as meaning “I am not sure”. And by the time they were sure, everybody else knew, so why should they update the system? Would it make sense to have “I am not sure” options in my mandatory fields? Would it make sense to enter a date that’s likely to be wrong?
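Placeholder values like this January 1st can often be spotted mechanically, because far too many records share the same date. Below is a rough Python sketch of that idea; the 20 % threshold and the sample dates are arbitrary assumptions, and a flagged date is only a candidate for asking the users, not proof of bad data.

```python
from collections import Counter
from datetime import date


def suspicious_dates(dates, min_share=0.2):
    """Return dates that account for at least min_share of all records."""
    counts = Counter(dates)
    total = len(dates)
    return {d: n for d, n in counts.items() if n / total >= min_share}


# Hypothetical "production start" values from a mandatory date field.
production_starts = [
    date(2006, 1, 1), date(2006, 3, 14), date(2006, 1, 1),
    date(2006, 5, 2), date(2006, 1, 1), date(2006, 7, 19),
]

for day, count in suspicious_dates(production_starts).items():
    print(day.isoformat(), "appears", count, "times")
```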
In a connected system there’s a “phase” field. Ordered, Reserved, Billed, Cancelled, … After a while it had some new phases: some contained the name of the person in charge of something, some contained a more specific job category. Not really phases, but useful info anyway.
Both examples show that the applications do not handle the immediate need of the user, so they use it “wrong” to get today’s job done today. Maybe that makes sense. I am not sure.
But the data is garbage.
There is a problem here that I wish I knew how to solve: The ones entering data are almost never the ones that benefit from good data. Data quality is irrelevant to them, the data itself and the entry of it is meaningless, and precision is extra unproductive time. One common solution is to market the importance, where arguments are “it is good for the company” and “if you work harder on this, then my work as a well paid decisionmaker gets even easier”. Yeah. Another solution is brute force, “you do this and you do it perfect”. Many women will do as they are ordered to, most men will perform more like Dilbert characters.
I think that it is impossible to have a system going like “the very minute you know the production start, enter it into the system” and working, unless there is an immediate advantage or reward, some part of the users job is easier, automated, something. In the meantime, data is garbage, by default.
Best wishes Harald
who washes MBs of data by hand/Excel every January
25 June 2006, 4:56 am
jkpieterse:
I have just finished a reporting system for a customer’s client support database.
I used pivot tables and -charts extensively. One thing that the manager was particularly happy with was that he could now walk over to his co-workers and show them what the implications are when they enter xxx for the cause of an issue, instead of just taking the time to enter (select!) a couple of meaningful keywords.
It also made records stand out for which no start or end date was entered or which had other missing data.
My points are:
I don’t think having mandatory fields is a bad thing.
I do think educating the workers on what the information is actually used for is very important.
And this will become even more useful if you can show your workers some sort of progress using KPIs: average response time, average resolution time, spread in that data related to department and/or customer, etcetera.
Good reporting can therefore help improve the quality of the data tremendously, and thus is a knife that cuts two ways: it gives the workers a reason to think about what they enter, and if the workers are confronted with the results it also motivates them to do a better job.
Pursuing “quality” data as such is meaningless without a good (and used!) reporting system.
25 June 2006, 12:52 pm
XL-Dennis:
Harald,
I believe you point out several important issues that are related to the subject.
As an external consultant I only deal with it from a management perspective, which of course limits my insight into the conditions for the co-workers.
As for the expression ‘Gold mine’, it may also refer to the great opportunity it offers to generate more business for the ERP vendors (and for me).
Kind regards,
Dennis
25 June 2006, 2:50 pm
Rob van Gelder:
I can relate.
Quite often, the users entering the data do not benefit. I’ll appear as the guy who forced chores on them.
I try to strike a balance. If I’m introducing (yet another) chore for the nominated data entry person, I’ll try to streamline/automate some other tedious task they also perform, aiming to balance their time (or even make the sum of their tasks quicker)
I won’t wash data though. As IT, I’m not capable of seeing the same value in clean data as business users are. I shouldn’t own their data since I can’t use it - I’m IT!
I might focus mistakenly on data that is of little business value.
When business users own their data they:
- appreciate the data flows and dependencies.
- communicate better (with other business users) as the data flows, eliminating IT as the middleman.
- see opportunities for re-use instead of building their own data islands
Rob
25 June 2006, 6:50 pm
Sam:
I think that when corporates implement ERPs, the inputs of the “end” user…. the very end user …. the guy who actually enters data on the shop floor / the customer support guy who updates incoming orders etc. are rarely sought.
The result is that the formats / options for fields / reports etc. are often decided by people who do not “work” on them (in the end they just want a nicely formatted spreadsheet e-mailed to them).
Secondly, in most cases a change in an ERP report, an additional field in a form etc. means “money”… this triggers the bean counters to go into “is this really critical” mode.
The result… you learn to live with the garbage that comes out of ERP…. and clean it manually / in Excel
Regards
Sam
25 June 2006, 10:15 pm
XL-Dennis:
Jan Karel,
Interesting comments
“I do think educating the workers on what the information is actually used for is very important.”
I fully agree, although it’s difficult to achieve. At a local company with approximately 130 employees they decided to use only 7 KPIs for production. The KPIs are presented on a weekly basis to all employees, including managers. Despite the discussion and information they still have issues with some workers, as they simply don’t care.
The point is that, regardless of size, it will always be an issue and therefore poor data will always exist in the ERPs.
I’m only aware of one company that has approximately 90 % ‘right’ data in its ERP. Their ‘success’ is based on ‘management by fear’ and a 100 % dictating ERP, which I totally disagree with.
Kind regards,
Dennis
26 June 2006, 8:03 am
jkpieterse:
I see your point. Getting it 100 % right is almost impossible. The only way to get near that percentage is when the people who actually enter the data will also benefit directly from correctly entered information. Which is another goal that is hard to achieve.
26 June 2006, 11:02 am
XL-Dennis:
Sometimes I get carried away, so my apologies for the following post:
Many years ago I was introduced to Shoshana Zuboff’s excellent book, “In the Age of the Smart Machine - The Future of Work and Power”. From time to time I still read some chapters to remind myself about the difficulties when it comes to humans and IT. Here is a link that gives some idea of what the book covers: http://www.stanford.edu/~jchong/articles/misc/Zuboff.pdf
Another book I still keep and also re-read is H. Thomas Johnson’s “Relevance Regained, From top-down control to bottom-up empowerment”.
We may find them old (first published in 1988 and 1992 respectively) and therefore outdated, but the messages they both bring are still valid (at least for me) when it comes to ERP and the subject of this post.
The first book is quite difficult to understand, which explains why I still read it with a dictionary not far away from me.
A general area is Human-Computer Interaction (HCI), which I must admit I’ve not had the time to fully explore. Most material targets the web, but I believe the theory can be applied to Excel and ERPs.
Kind regards,
Dennis
26 June 2006, 12:55 pm
Jim Thomlinson:
If only we could keep people from touching our systems they would work perfectly. I see two kinds of issues.
1. The data entered was specifically garbage. I was once compelled to take a look through the order entry system of a company I was working for. Turns out we had some very famous clients… George Patton, Peter Parker, Bruce Wayne, Elmer Fudd. Is this data garbage? Well, we sold it to someone! Oddly enough, it worked in my favour to be able to prove that some of our data was garbage. My personal favorite is when the end users provide specific information showing that they are doing something they have been told not to do.
2. The data does not lend itself to drawing any conclusions. A project I am working on now requires me to look for a long-term trend. Based on the data that I have I cannot say that there is a trend. While the average points down, there is too much variation to say whether that is a trend or an anomaly. To tell someone that the distance from here to there is 5 miles + or - 10 miles means that there is no credibility to the 5 in the first place… In a lot of the reports I see, the entire question of how good the data is is never investigated or answered.
So how do you filter garbage out? You think of every stupid thing that someone might have done and you remove all of that. You then hope that in the process you have not removed too much good data. There is always a risk when you filter data that you are removing good data. Where is the correct balance? That is as much of an art as it is a science. It depends on what kind of info you are creating…
It is better to provide no information than it is to provide bad information. With no information, the person making the decision knows they are flying by the seat of their pants. Bad information makes decision makers assume things that are just plain wrong.
26 June 2006, 5:01 pm
Alex J:
A couple of thoughts:
First, it is important to understand the motivation of individuals who provide data, especially in the realm of forecasting.
For example, salespersons who are incented to provide a flow of orders routinely forecast stuff that just cannot happen, because of pressure to provide forecast data which fulfills their quota and their managers’ business plan. The result: the rollup of sales forecasts looks great, but the current period results don’t deliver.
Likewise, project managers often ‘neglect’ to indicate negative or pending negative factors (schedule/cost overruns) in their project reports until it is too late. The result: the rollup of project forecasts looks optimistic, but we keep getting nasty “surprises”.
What to do?
Probably, changing culture can help: it’s OK to have a project problem - the PM should be praised (not fired) for detecting problems early and pushing for resolution and adaptation of the project plan. Truth in reporting is key.
Review forecast data before reporting - review the sales forecast with the sales MANAGER - he/she will be far more likely to weed out the improbable order forecasts. Truth in reporting is key.
Filter the data: Sales Forecast data in reports is often factored or filtered using probability or confidence. Another way is to take forecast data and defer its impact on the report by using a “time delay”, so that a big order expected in August (say) which has uncertainty might be delayed until October, rather than saying that 75% of the order will still come in during the August period.
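To make the two filtering options concrete, here is a small Python sketch that contrasts probability weighting with a time delay. The order values, the probabilities, the 0.9 confidence threshold and the two-month delay are made-up assumptions, and months are plain integers with no year roll-over handling.

```python
def weighted_forecast(orders):
    """Per month, sum each order's value times its assumed win probability."""
    totals = {}
    for o in orders:
        totals[o["month"]] = totals.get(o["month"], 0.0) + o["value"] * o["probability"]
    return totals


def delayed_forecast(orders, threshold=0.9, delay_months=2):
    """Keep the full order value but push uncertain orders to a later month."""
    totals = {}
    for o in orders:
        uncertain = o["probability"] < threshold
        month = o["month"] + delay_months if uncertain else o["month"]
        totals[month] = totals.get(month, 0.0) + o["value"]
    return totals


# Hypothetical orders: a big uncertain order and a small certain one,
# both expected in August (month 8).
orders = [
    {"month": 8, "value": 100_000, "probability": 0.75},
    {"month": 8, "value": 20_000, "probability": 1.0},
]

print(weighted_forecast(orders))  # the big order counts for 75 % in August
print(delayed_forecast(orders))   # the big order counts in full, but in October
```

The weighted version never matches what actually happens to a single order (it is either won or lost), which is one argument for the time-delay view on large individual deals.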
Second:
One of the key issues related to portfolio reporting (e.g. all the projects in an area, program, portfolio, business) is that when some data is NOT available in a report, the value of the overall portfolio report is in question. My approach is to use indicators against individual lines ( e.g. - project data not updated this month), and then to count and weigh the impact of these and indicate a summary ‘health indicator’ on the report so that senior managers understand “what isn’t there”. The list of projects with old or missing data also provides a handy list of follow-up actions when the project manager’s name is attached!
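A summary ‘health indicator’ of this kind could be computed along the following lines. This is only a sketch under assumptions: the project fields, the 31-day staleness limit and the choice of budget as the weight are all hypothetical.

```python
from datetime import date


def portfolio_health(projects, report_date, max_age_days=31):
    """Summarise how much of the portfolio rests on old or missing updates."""
    stale = [
        p for p in projects
        if p["last_update"] is None
        or (report_date - p["last_update"]).days > max_age_days
    ]
    total_budget = sum(p["budget"] for p in projects)
    stale_budget = sum(p["budget"] for p in stale)
    return {
        "stale_count": len(stale),
        "stale_budget_share": stale_budget / total_budget if total_budget else 0.0,
        # Follow-up list with the project manager's name attached.
        "follow_up": sorted((p["manager"], p["name"]) for p in stale),
    }


# Hypothetical portfolio.
projects = [
    {"name": "Alpha", "manager": "Ann", "budget": 500, "last_update": date(2006, 6, 20)},
    {"name": "Beta", "manager": "Bob", "budget": 300, "last_update": date(2006, 4, 30)},
    {"name": "Gamma", "manager": "Cat", "budget": 200, "last_update": None},
]

print(portfolio_health(projects, report_date=date(2006, 6, 27)))
```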
Sorry for the long post - hope it provokes more discussion.
27 June 2006, 6:24 am
Jim:
Enterprise resource planning software, or ERP, doesn’t live up to its acronym. Forget about planning—it doesn’t do much of that—and forget about resource, a throwaway term. But remember the enterprise part. This is ERP’s true ambition. The software attempts to integrate all departments and functions across a company onto a single computer system that can serve all those departments’ particular needs.
Building a single software program that serves the needs of people in finance as well as it does the people in human resources and in the warehouse is a tall order. Each of those departments typically has its own computer system optimized for the particular ways that the department does its work. But ERP combines them all together into a single, integrated software program that runs off a single database so that the various departments can more easily share information and communicate with each other.
That integrated approach can have a tremendous payback if companies install the software correctly.
http://www.cio.com/research/erp/edit/erpbasics.html#erp_abc
You can find the rest of the above quote at the link above. I found it very helpful.
27 June 2006, 7:51 am
XL-Dennis:
I find it very interesting to learn from all of you and to see how you view it. It also shows how complex the subject actually is.
Thanks for the link to the ABCs of ERP.
It looks like some of you need to do a lot of work washing the data and also checking each line of data to verify it. Does it also mean that you have created Excel solutions to do the work? If so, is it possible for you to share them with us?
“Sorry for the long post - hope it provokes more discussion.”
No need to apologize for a long post, and I fully agree that it should provoke an ongoing discussion.
Kind regards,
Dennis
27 June 2006, 8:23 am
Harald Staff:
Ditto that. Keep them long posts coming, these are important interesting issues and we don’t get to chat about them too often.
My tools are really simple; loops to spot minimum and maximum values, unlikely changes and fields with suspiciously little info. From there the good old Autofilter is my best friend for a long while. My tool then is my “autofilter assistant”, a form displaying filter settings and criteria, with buttons for “active cell value too” and “active cell value instead” to add/switch criteria, and an unlimited save-recall for complete filter settings. That app was really fun to write and, as far as I know, coding it kept me from going insane last winter. But who knows.
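For readers who prefer code to prose, loops of this kind might look like the Python sketch below. It is not Harald’s tool: the column names, the sample rows and the “more than a factor of 3” rule for unlikely changes are assumptions chosen for illustration.

```python
def column_profile(rows, column):
    """Minimum, maximum, fill count and distinct count for one column."""
    values = [r[column] for r in rows if r[column] not in (None, "")]
    return {
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "filled": len(values),
        "distinct": len(set(values)),
    }


def unlikely_changes(values, max_ratio=3.0):
    """Indexes where a positive value differs from its predecessor by more than max_ratio."""
    flagged = []
    for i in range(1, len(values)):
        prev, cur = values[i - 1], values[i]
        if prev > 0 and cur > 0 and max(prev, cur) / min(prev, cur) > max_ratio:
            flagged.append(i)
    return flagged


# Hypothetical rows pulled from a system.
rows = [
    {"qty": 10, "note": "ok"},
    {"qty": 12, "note": ""},
    {"qty": 400, "note": "ok"},
    {"qty": 11, "note": None},
]

print(column_profile(rows, "qty"))   # extreme max hints at a typing error
print(column_profile(rows, "note"))  # few filled, one distinct: little info
print(unlikely_changes([r["qty"] for r in rows]))
```

Rows flagged this way are candidates for a closer look with the Autofilter, not automatic deletions.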
Best wishes Harald
27 June 2006, 2:08 pm
Sam:
To Jim:
Ditto… I agree
The “Enterprise” of ERP means that it helps integrate all departments on a common platform…
Well, this can be achieved if companies follow a well thought out, well published, enforceable filing system on their PCs / network drives.
Take a look at any PC today and you are likely to find
a) an overcrowded desktop
b) data stored in an irregular manner…
Compare data stored on 2 PCs belonging to 2 people doing the same job in a dept… and you will see people store data differently….. name files differently etc.
If companies establish a file naming convention, a folder structure etc…. and train people to follow it… data can be shared on a network drive / PCs across the enterprise (with proper permission).
I believe that if this is properly done…. you get the advantage of the “Enterprise” in ERP, and spreadsheet solutions would start to work just fine…..
Regards
Sam
28 June 2006, 1:22 am
DM Unseen:
For an interesting site about data quality: http://www.arvix.nl/ (some of it is in Dutch)
For me Data Quality is tightly linked to information modelling and business rules/model assertions.
My main tool for investigating difficult quality issues is SQL and patience! For really polluted data (i.e. data that is technically polluted) I use Excel.
Technical data quality problems mean the actual fields have an inconsistent/incorrect domain, i.e. they contain illegal dates, mix numbers and text, and so forth. Since these issues prevent a load into a well-defined database table, I primarily use Excel for this task.
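As an illustration of what a check for technical pollution can look like before a load, here is a hedged Python sketch. The column names, the ISO date format and the sample rows are assumptions; the same checks can of course be done with Excel formulas, as described above.

```python
from datetime import datetime


def is_valid_date(text, fmt="%Y-%m-%d"):
    """True if text parses as a real calendar date in the given format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except (ValueError, TypeError):
        return False


def is_number(text):
    """True if text can be read as a number."""
    try:
        float(text)
        return True
    except (ValueError, TypeError):
        return False


def domain_violations(rows, date_columns=(), numeric_columns=()):
    """Return (row_index, column, value) for each value outside its domain."""
    problems = []
    for i, row in enumerate(rows):
        for col in date_columns:
            if not is_valid_date(row[col]):
                problems.append((i, col, row[col]))
        for col in numeric_columns:
            if not is_number(row[col]):
                problems.append((i, col, row[col]))
    return problems


# Hypothetical rows from a text load file.
load_rows = [
    {"order_date": "2006-02-28", "qty": "15"},
    {"order_date": "2006-02-30", "qty": "12"},   # illegal date
    {"order_date": "2006-03-01", "qty": "n/a"},  # text in a numeric field
]

for problem in domain_violations(load_rows, ["order_date"], ["qty"]):
    print(problem)
```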
28 June 2006, 2:27 am
XL-Dennis:
DM Unseen
Interesting to learn. Does it mean that you first pull the polluted data into Excel, clean it up, and then load it into the database?
Kind regards,
Dennis
28 June 2006, 10:57 am
Jim Thomlinson:
I like pivot tables for cleaning data. I will pull a query from a main transactional database and pump it into an Access database (a quick and dirty area, stored locally, all my own). I will then create a couple of queries to filter out specific types of transactions. I use that as the basis for my pivot tables. I then pivot the data around and look at it from a dozen different angles, aggregated in a host of different ways. Based on that I can usually find assorted bits of weirdness that I will use as a basis to fine-tune my queries and remove the garbage that has accumulated. The nice thing about pivots is that:
- I am not confined to 65,536 records
- I can very quickly look at huge volumes of stuff
- I can view data in a hierarchy and find errors in different dimensions. You can see anomalies in transactions over time, or by geography, or by department or…
- I can modify the source query that feeds the pivot and ensure that my query removes only the garbage I intended.
- I can zoom into the source transactions using the double click to see the underlying data.
29 June 2006, 5:17 pm
Alex J:
Has anyone got experience using SPC (Statistical Process Control) techniques in this context?
I know enough about the subject to know that the exact math techniques will not be valid for much of the data we look at, but I’ve been wondering if the general approach of calculating “Control Limits” based on the content of the data could be used as a guideline for validation. It would be nice to use something like this to (partially) automate the kind of approach Harald is referring to in comment 19 above.
Are there similar techniques which would be valid for “non-random” data generated in projects, for instance?
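One way to try the idea is the individuals (XmR) chart from SPC, which derives limits from the average moving range of a baseline sample (the constant 2.66 is the usual factor for that chart). The Python sketch below uses made-up numbers and is meant as a validation guideline only; as noted above, the statistical assumptions behind control charts may not hold for project data.

```python
def xmr_limits(baseline):
    """Lower limit, centre line and upper limit of an individuals (XmR) chart."""
    centre = sum(baseline) / len(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return centre - 2.66 * mr_bar, centre, centre + 2.66 * mr_bar


def out_of_limits(values, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical baseline of trusted values and a batch of new entries to validate.
baseline = [4.0, 4.2, 3.9, 4.1, 4.0, 3.8, 4.1, 4.0]
new_values = [4.1, 6.0, 3.9, 0.5]

lower, centre, upper = xmr_limits(baseline)
print(round(lower, 3), round(centre, 3), round(upper, 3))
print(out_of_limits(new_values, lower, upper))
```

Values outside the limits would be queued for manual review rather than rejected outright.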
Color me Statistically Curious.
30 June 2006, 6:16 am
DM Unseen:
Dennis, Jim,
I do similar stuff to what Jim does, and he does have a point about the Excel limit.
I usually use Excel formulas and PivotTables to find anomalies, and then repair them in the source system with SQL if I can. If the source is a text file (a load file for a DB) I repair the issues using Excel and then load it into a database. For very large sets (where I cannot depend on getting a representative subset of the data into Excel) I usually try to stick with SQL Server and not Access.
30 June 2006, 6:18 am
XL-Dennis:
Jim / DM Unseen
Thanks for your kind input. In view of what others have pointed out, one conclusion is that a lot of maintenance and quality assurance is done ‘behind the scenes’. Excel 2007 seems to be a welcome tool for working with large sets of data.
Kind regards,
Dennis
1 July 2006, 3:45 am
Remco:
Hi Dennis,
Intriguing post. One could write a book about the topics you brought up here. Given the diversity of data it is not possible to give a single answer that covers everything, it will depend on the type of data and the goals you are trying to achieve.
But in light of the ‘dirty data’ discussion I would like to mention that even dirty data is sometimes usable and shouldn’t be eliminated. I’ll illustrate with a real life example.
The company that I work for manufactures several types of machines. One of the things management would like to know is how much time does it take to assemble machine X and machine Y. In ERP language this is called ‘Routing’. After all, labor is part of the equation that determines the prices of products.
About two years ago I had to start a project for entering the routing data in the ERP system and do something useful with it. Mind you that management simply asked me to find out how long it takes to assemble machine X, ignorant of the complexity of such a task.
In an ERP system a unique Work Order (number) is issued for each separate job.
Of course I ran into several problems (excuse me; challenges) . Here’s a few of them relevant to this topic:
1) how to register the worked time for assembling machine X?
2) who should enter that data in the ERP system?
3) How to interpret the entered data in the ERP system?
Ad 1) We decided to let the employees write down the time they worked on each work order. As mentioned in other posts, it is important to let them know that it is necessary to do this as thoroughly as possible.
The very first impression they got was that they were being checked up on, to make sure they weren’t spending too much time on certain work orders. You’ll understand that this invites creative administration, or worse, hurrying a job, so it was very important to set this straight.
Ad 2) Entering routing data in the ERP system is, well, tacky to say the very least. After two months of wading through the mud I wrote an Oracle database (Oracle was already there because the ERP system uses it) to store the routing data and created an Excel application for entering the routing data into the database. The Excel application retrieves data from the ERP system (like work order details and employee names), ensures data consistency of the routing data and allows me to run some checks before writing it to the Oracle database.
We now let the employees enter the data in the Excel application because it gives them a welcome break from their normal work.
Ad 3) I created several reports with a reporting tool to filter certain work orders or hours. I also created a fairly complex workbook that calculates the mean assembly time for machine X. To do this it first retrieves a list of finished work orders that were issued to assemble machine X. The thing here is that I don’t filter the data! If it takes about 4 hours to assemble machine X, then most work orders will be in the range 3.5 – 4.5 hours. But there are also work orders that took only 3 hours to complete and some that took 6 hours to complete.
My statement is now as follows: by filtering out the excesses I am in fact polluting the data.
The idea behind this is that the data represents the real situation. There probably was a work order that took 6 hours to complete. Maybe the employee in question was sleepy on Monday morning, maybe his coffee break was too long or maybe he just made a mistake reading his watch or writing it down. It doesn’t really matter. This employee works 8 hours a day, he wrote 6 hours on this particular work order and he wrote the remaining 2 hours on another work order. If I eliminate the 6 hour work order from my calculation then the calculated average would be lower and the machine would be sold for too low a price. In addition I’d see on the weekly chart that this employee only worked (wrote) 34 hours.
Quality in this case would mean that each employee writes 40 hours when he worked 40 hours.
As for the quality of the calculated average; the more work orders are used for the calculation, the better it represents the actual average. I use a standard deviation to calculate the quality of an average. In simple language; if my average is based on just 1 work order then it would be foolish to use that number to calculate labor cost. If it’s based on 10 work orders it’s fairly safe to use that number, even if there’s a considerable spread in the data.
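The point about sample size can be illustrated with the standard error of the mean, which shrinks as more work orders are included. The Python sketch below uses invented hours and is not the workbook described above; it also shows how dropping the 6-hour order pulls the average down.

```python
import math
import statistics


def average_with_uncertainty(hours):
    """Return (mean, standard error of the mean); the error is None for a single value."""
    mean = statistics.mean(hours)
    if len(hours) < 2:
        return mean, None
    return mean, statistics.stdev(hours) / math.sqrt(len(hours))


# Hypothetical assembly times (hours) for finished work orders of machine X.
one_order = [4.0]
ten_orders = [3.5, 4.0, 4.5, 3.0, 6.0, 4.0, 3.8, 4.2, 4.1, 3.9]
without_outlier = [h for h in ten_orders if h != 6.0]

print(average_with_uncertainty(one_order))        # no way to judge the quality
print(average_with_uncertainty(ten_orders))       # average with its standard error
print(average_with_uncertainty(without_outlier))  # lower average once 6.0 is dropped
```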
So there you go, an example where messing up the quantity of data messes up the quality of data.
Remco
3 July 2006, 5:28 am
XL-Dennis:
Remco,
My basic idea for the post itself was to put this subject on the agenda and see how others deal with it. Of course, in view of all the answers, I guess it would require at least 2 ‘bibles’ about the subject.
Anyway,
“So there you go, an example where messing up the quantity of data messes up the quality of data.”
Many thanks for the input. I see your point, which (again) reflects the complexity of actual real cases.
Thanks and all the very best from,
Dennis
6 July 2006, 9:52 am