Lecture 5: Estimation. Estimate size, then estimate effort, schedule, and cost from size & complexity. CS 568
Project Metrics
- Cost and schedule estimation
- Measure progress
- Calibrate models for future estimating
Metric scope: manager and product. Number of projects x number of metrics = 15-20
Approaches to Cost Estimation
- By experts
- By analogies
- Decomposition
- Parkinson's Law: work expands to fill time
- Pricing to win: customer willingness to pay
- Lines of Code
- Function Points
- Mathematical models: Function Points & COCOMO
Time vs. staff-months. (Chart: schedule time vs. staff-months, showing T_theoretical, the impossible-design region below 75% of T_theoretical, and linear increase beyond it.) Boehm: "A project cannot be done in less than 75% of theoretical time," where T_theoretical = 2.5 x (staff-months)^(1/3). But how can I estimate staff-months?
PERT estimation. Mean schedule date = (earliest date + 4 x likely date + latest date) / 6. Standard deviation = (latest date - earliest date) / 6. This is a β distribution.
Example: if min = 10 months, mode = 13.5 months, max = 20 months, then Mean = (10 + 4 x 13.5 + 20)/6 = 14 months and Std. Dev = (20 - 10)/6 ≈ 1.67 months.
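A minimal sketch of the PERT arithmetic above in Python; the function name and layout are illustrative, not from the lecture.

```python
def pert_estimate(earliest, likely, latest):
    """PERT beta approximation: mean and standard deviation of the schedule."""
    mean = (earliest + 4 * likely + latest) / 6
    std_dev = (latest - earliest) / 6
    return mean, std_dev

# Slide example: min = 10, mode = 13.5, max = 20 months
mean, sd = pert_estimate(10, 13.5, 20)
print(mean, round(sd, 2))  # 14.0 1.67
```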
Probability Distributions
See www.brighton-webs.co.uk/distributions/beta.asp. The mean, mode and standard deviation in the table below are derived from the minimum, maximum and shape factors which resulted from the use of the PERT approximations.

Statistic            Beta    Triangular
Mean                 14.00   14.50
Mode                 13.65   13.5
Standard Deviation   1.67    2.07
Q1 (25%)             12.75   12.96
Q2 (50%, Median)     13.91   14.30
Q3 (75%)             15.17   15.97
Sizing Software Projects
Effort = (productivity)^-1 x (size)^c
- productivity ≡ KSLOC per staff-month, so (productivity)^-1 is staff-months per KSLOC
- size ≡ NCKSLOC
- c is a function of staff skills
(Chart: staff-months vs. lines of code or function points.)
Understanding the equations. Consider a transaction project of 38,000 lines of code; what is the shortest time it will take to develop? Module development runs about 400 SLOC per staff-month.
Effort = (productivity)^-1 (size)^c = (1 / 0.400 KSLOC/SM) x (38 KSLOC)^1.02 = 2.5 x (38)^1.02 ≈ 100 SM
Min time = 0.75 T = (0.75)(2.5)(SM)^(1/3) ≈ 1.875 x (100)^(1/3) ≈ 1.875 x 4.63 ≈ 9 months
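The same calculation as a hedged Python sketch; the exponent c = 1.02 and the productivity of 0.400 KSLOC per staff-month are the slide's assumptions, and the helper names are mine.

```python
def effort_staff_months(size_ksloc, productivity_ksloc_per_sm, c=1.02):
    """Effort = (productivity)^-1 * (size)^c."""
    return (1 / productivity_ksloc_per_sm) * size_ksloc ** c

def min_schedule_months(staff_months):
    """Boehm: T_theoretical = 2.5 * staff_months^(1/3); minimum is ~75% of that."""
    return 0.75 * 2.5 * staff_months ** (1 / 3)

effort = effort_staff_months(38, 0.400)  # ~102 staff-months (slide rounds to 100)
print(round(effort), round(min_schedule_months(effort), 1))  # 102 8.8
```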
How many software engineers? 1 full-time staff week = 60 hours, half spent on the project (30 hours); 1 student week = 20 hours. Therefore an estimate of 100 staff-months is actually 150 student-months. 150 student-months / 5 months per semester = 30 student software engineers, therefore simplification is mandatory.
Lines of Code
LOC ≡ Line of Code
KLOC ≡ Thousands of LOC
KSLOC ≡ Thousands of Source LOC
NCSLOC ≡ New or Changed Source LOC
Productivity per staff-month (Bernstein's rule of thumb for small components):
- 50 NCSLOC for OS code (or real-time systems)
- 250-500 NCSLOC for intermediary applications (high risk, on-line)
- 500-1000 NCSLOC for normal applications (low risk, on-line)
- 10,000-20,000 NCSLOC for reused code
Reuse note: sometimes, reusing code that does not provide the exact functionality needed can be achieved by reformatting input/output. This decreases performance but dramatically shortens development time.
“Productivity” as measured in 2000:
- Classical rates: 130-195 NCSLOC/sm
- Evolutionary or incremental approaches (customized): 244-325 NCSLOC/sm
- Reused code: 1000-2000 NCSLOC/sm
- New embedded flight software (customized): 17-105 NCSLOC/sm
- Code built for reuse: about 3x the effort of customized code
QSE Lambda Protocol
- Prospectus
- Measurable Operational Value
- Prototyping or Modeling
- sQFD
- Schedule, Staffing, Quality Estimates
- ICED-T
- Trade-off Analysis
Universal Software Engineering Equation: Reliability R(t) = e^(-kλt) when the error rate is constant, where k is a normalizing constant for a software shop and λ = complexity / (effectiveness x staffing).
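A small sketch of this equation in Python; the inputs below are illustrative values only, since k and the λ inputs must be calibrated per shop.

```python
import math

def reliability(t, k, complexity, effectiveness, staffing):
    """R(t) = exp(-k * lam * t), with lam = complexity / (effectiveness * staffing)."""
    lam = complexity / (effectiveness * staffing)
    return math.exp(-k * lam * t)

# Illustrative numbers only, not calibrated data:
print(reliability(t=1, k=1.0, complexity=3.0, effectiveness=2.0, staffing=5.0))  # ~0.74
```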
Post-Release Reliability Growth in Software Products. Authors: Pankaj Jalote, Brendan Murphy, Vibhu Saujanya Sharma. Guided by: Prof. Lawrence Bernstein. Prepared by: Mautik Shah.
Introduction. The failure rate of software products decreases with time, even when no software changes are being made. This violates our intuition: reliability grows without any fault removal. Modeling this reliability growth in the initial stages after product release is the focus of this paper.
Three possible reasons:
- Users learn to avoid the faults that cause failures; a failure is never truly random.
- After initially exploring many different features and options, users settle on a small set of product features, thereby reducing the number of fault-carrying paths that are actually exercised.
- Installing new software onto existing systems often results in versioning and configuration issues which cause failures.
Failure rate model
Using product support data
Using data from Automated Reporting
Product stabilization time. Stabilization time reflects the product's transient defects as well as the user experience. A smaller stabilization time means that end users will have fewer troubles. If the steady-state failure rate of a product is acceptable, then instead of investing in system testing the vendor may need to focus on improving issues related to installation, configuration, usage, etc. to reduce stabilization time. A high stabilization time requires a different strategy for improving the user experience than is needed for dealing with a high steady-state failure rate.
Conclusion. Traditional software reliability models generally assume that software reliability is primarily a function of fault content and remains unchanged if the software is unchanged. But the failure rate often gets smaller with time, even without any changes being made to the product. This may be due to users learning to avoid the situations that cause failures, using a limited subset of the product's functionality, resolving configuration issues, etc. Stabilization time is the time it takes after installation for the failure rate to reach its steady-state value. For an organization which plans to have its employees use a software product, the stabilization time could indicate the period after which the organization could expect production usage of the product.
Derivation of the reliability equation, valid after the stabilization interval. Let T be the stabilization time; then g(T) is some constant failure rate F. To convert from a rate to a time function we integrate the Fourier transform:
R(t - T) = ∫₀^∞ g(ω) exp(-λ(t - T)) dω
With g(ω) a constant F and τ = t - T:
R(τ) = F exp(-λτ), where λ = complexity / effective staffing
Function Point (FP) Analysis
- Useful during the requirements phase
- Substantial data supports the methodology
- Software skills and project characteristics are accounted for in the Adjusted Function Points
- FP is technology and project-process dependent, so technology changes require recalibration of project models
- Convert Unadjusted FPs (UFP) to LOC for a specific language (technology), then use a model such as COCOMO
Productivity = f(size). (Chart: productivity in function points per staff-month vs. size in function points, comparing Bell Laboratories data and Capers Jones data.)
Adjusted Function Points: accounting for physical system characteristics. Each General System Characteristic (GSC) is rated by the system user 0-5 based on "degree of influence" (3 is average). Then AFP = UFP x (0.65 + 0.01 x GSC total); note the GSC total is also referred to as the VAF or TDI. The 14 characteristics: Data Communications, Distributed Data/Processing, Performance Objectives, Heavily Used Configuration, Transaction Rate, On-Line Data Entry, End-User Efficiency, On-Line Update, Complex Processing, Reusability, Conversion/Installation Ease, Operational Ease, Multiple Site Use, Facilitate Change.
Function Point Calculations. Unadjusted Function Points: UFP = 4I + 5O + 4E + 10L + 7F, where
I ≡ count of input types that are user inputs and change data structures
O ≡ count of output types
E ≡ count of inquiry types, i.e., inputs controlling execution [think menu selections]
L ≡ count of logical internal files, internal data used by the system [think index files; they are groups of logically related data entirely within the application's boundary and maintained by external inputs]
F ≡ count of interfaces, data output or shared with another application
Note that the constants in the nominal equation can be calibrated to a specific software product line.
Complexity Table

TYPE:           SIMPLE  AVERAGE  COMPLEX
INPUT (I)       3       4        6
OUTPUT (O)      4       5        7
INQUIRY (E)     3       4        6
LOG INT (L)     7       10       15
INTERFACES (F)  5       7        10
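A sketch of the weighted count in Python using the table above; taking the "average" column for every component reproduces the nominal UFP = 4I + 5O + 4E + 10L + 7F. The data layout and function name are my own illustration.

```python
# Weights from the complexity table; rows are component types, columns are ratings.
WEIGHTS = {
    "input":     {"simple": 3, "average": 4,  "complex": 6},
    "output":    {"simple": 4, "average": 5,  "complex": 7},
    "inquiry":   {"simple": 3, "average": 4,  "complex": 6},
    "logical":   {"simple": 7, "average": 10, "complex": 15},
    "interface": {"simple": 5, "average": 7,  "complex": 10},
}

def unadjusted_fp(counts):
    """counts maps (component type, complexity rating) to how many were counted."""
    return sum(WEIGHTS[kind][rating] * n for (kind, rating), n in counts.items())

# Example: 4 average inputs, 5 average outputs, 2 average logical files
print(unadjusted_fp({("input", "average"): 4,
                     ("output", "average"): 5,
                     ("logical", "average"): 2}))  # 4*4 + 5*5 + 2*10 = 61
```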
Complexity Factors
1. Problem Domain ___
2. Architecture Complexity ___
3. Logic Design - Data ___
4. Logic Design - Code ___
Total ___
Complexity = Total / 4 = ___
Problem Domain: measure of complexity (1 is simple and 5 is complex). Score ____
1. All algorithms and calculations are simple.
2. Most algorithms and calculations are simple.
3. Most algorithms and calculations are moderately complex.
4. Some algorithms and calculations are difficult.
5. Many algorithms and calculations are difficult.
Architecture Complexity: measure of complexity (1 is simple and 5 is complex). Score ____
1. Code ported from one known environment to another. Application does not change more than 5%.
2. Architecture follows an existing pattern. Process design is straightforward. No complex hardware/software interfaces.
3. Architecture created from scratch. Process design is straightforward. No complex hardware/software interfaces.
4. Architecture created from scratch. Process design is complex. Complex hardware/software interfaces exist but they are well defined and unchanging.
5. Architecture created from scratch. Process design is complex. Complex hardware/software interfaces are ill defined and changing.
Logic Design - Data. Score ____
1. Simple, well-defined and unchanging data structures. Shallow inheritance in class structures. No object classes have inheritance greater than three.
2. Several data element types with straightforward relationships. No object classes have inheritance greater than three.
3. Multiple data files, complex data relationships, many libraries, large object library. No more than ten percent of the object classes have inheritance greater than three. The number of object classes is less than 1% of the function points.
4. Complex data elements, parameter passing module-to-module, complex data relationships, and many object classes have inheritance greater than three. A large but stable number of object classes.
5. Complex data elements, parameter passing module-to-module, complex data relationships, and many object classes have inheritance greater than three. A large and growing number of object classes. No attempt to normalize data between modules.
Logic Design - Code. Score __
1. Nonprocedural code (4GL, generated code, screen skeletons). High cohesion. Programs inspected. Module size constrained between 50 and 500 Source Lines of Code (SLOC).
2. Program skeletons or patterns used. High cohesion. Programs inspected. Module size constrained between 50 and 500 SLOC. Reused modules. Commercial object libraries relied on.
3. High cohesion. Well-structured, small modules with low coupling. Object class methods well focused and generalized. Modules with single entry and exit points. Programs reviewed.
4. Complex but known structure, randomly sized modules. Some complex object classes. Error paths unknown. High coupling.
5. Code structure unknown, randomly sized modules, complex object classes and error paths unknown. High coupling.
Computing Function Points See http://www.engin.umd.umich.edu/CIS/course.des/cis525/js/f00/artan/functionpoints.htm
Adjusted Function Points (review)
- Account for 14 characteristics on a six-point scale (0-5)
- Total Degree of Influence (DI) is the sum of the scores
- DI is converted to a technical complexity factor (TCF): TCF = 0.65 + 0.01 x DI
- Adjusted Function Points: AFP = UFP x TCF
- For any language there is a direct mapping from Unadjusted Function Points to LOC
- Beware: function point counting is hard and needs special skills
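Continuing the sketch: the adjustment step, with the 14 GSC ratings as a plain list. An all-average rating of 3 gives DI = 42 and TCF = 1.07.

```python
def adjusted_fp(ufp, gsc_scores):
    """AFP = UFP * TCF, where TCF = 0.65 + 0.01 * DI (DI = sum of 14 ratings, each 0-5)."""
    assert len(gsc_scores) == 14 and all(0 <= s <= 5 for s in gsc_scores)
    tcf = 0.65 + 0.01 * sum(gsc_scores)
    return ufp * tcf

print(adjusted_fp(61, [3] * 14))  # TCF = 1.07, AFP = 65.27
```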
Function Points Qualifiers
- Based on counting data structures
- Focus is on-line database systems
- Less accurate for Web applications
- Even less accurate for games, finite-state-machine, and algorithmic software
- Not useful for extended machine software and compilers
- An alternative to NCKSLOC because estimates can be based on requirements and design data
Function Point pros and cons
Pros: language independent; understandable by client; simple modeling; hard to fudge; visible feature creep.
Cons: labor intensive; extensive training; inexperience results in inconsistent results; weighted to file manipulation and transactions; systematic error introduced by a single rater, so multiple raters are advised.
Initial Conversion (http://www.qsm.com/FPGearing.html)

Language      Median SLOC/UFP
C             104
C++           53
HTML          42
Java          59
Perl          60
J2EE          50
Visual Basic  42
SLOC: 78 UFP x 53 SLOC/UFP (C++) = 4,134 SLOC ≈ 4.1 KSLOC. (Reference for SLOC per function point: http://www.qsm.com/FPGearing.html)
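The gearing-table lookup as a one-step sketch; the figures come from the QSM table above, and the dictionary layout is my own.

```python
SLOC_PER_UFP = {"C": 104, "C++": 53, "HTML": 42, "Java": 59,
                "Perl": 60, "J2EE": 50, "Visual Basic": 42}

def ufp_to_sloc(ufp, language):
    """Convert unadjusted function points to SLOC via the median gearing factor."""
    return ufp * SLOC_PER_UFP[language]

print(ufp_to_sloc(78, "C++"))  # 4134 SLOC, i.e. about 4.1 KSLOC
```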
Expansion Trends. (Chart: the expansion factor, the ratio of a source line of code to a machine-level line of code, grows an order of magnitude every twenty years; each date is an estimate of widespread use of a software technology. Technologies plotted: machine instructions, macro assemblers, high-level languages, database managers, on-line development, prototyping, subsecond time sharing, regression testing, 4GLs, small-scale reuse, object-oriented programming, large-scale reuse.)
Heuristics to do Better Estimates
- Decompose the Work Breakdown Structure to the lowest possible level and type of software
- Review assumptions with all stakeholders
- Do your homework: past organizational experience
- Retain contact with developers
- Update estimates and track new projections (and warn)
- Use multiple methods
- Reuse makes it easier (and more difficult)
- Use a 'current estimate' scheme
Heuristics to meet aggressive schedules
- Eliminate features
- Simplify features & relax specific feature specifications
- Reduce gold plating
- Delay some desired functionality to version 2
- Deliver functions to the integration team incrementally
- Deliver the product in periodic releases
Specification for Development Plan
- Project Feature List
- Development Process
- Size Estimates
- Staff Estimates
- Schedule Estimates
- Organization
- Gantt Chart
COCOMO: COnstructive COst MOdel
- Based on Boehm's analysis of a database of 63 projects; models based on regression analysis of these systems
- Linked to the classic waterfall model
- Size input is Source Lines of Code (SLOC) expressed in thousands of delivered source instructions; excludes comments and unmodified utility software
COCOMO Formula: Effort in staff-months = a x KDLOC^b

Mode           a    b
organic        2.4  1.05
semi-detached  3.0  1.12
embedded       3.6  1.20
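The basic COCOMO table as a Python sketch; the coefficients are from the slide, and the mode names follow Boehm.

```python
# Basic COCOMO coefficients (a, b) by development mode.
COCOMO = {"organic": (2.4, 1.05),
          "semi-detached": (3.0, 1.12),
          "embedded": (3.6, 1.20)}

def cocomo_effort(kdloc, mode="organic"):
    """Effort in staff-months = a * KDLOC^b."""
    a, b = COCOMO[mode]
    return a * kdloc ** b

print(round(cocomo_effort(38, "organic")))  # ~109 staff-months
```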
A Retrospective on the Regression Models. They came to similar conclusions.
Time: Watson-Felix T = 2.5E^0.35; COCOMO (organic) T = 2.5E^0.38; Putnam T = 2.4E^0.33.
Effort: Halstead E = 0.7 KLOC^1.50; Boehm E = 2.4 KLOC^1.05; Watson-Felix E = 5.2 KLOC^0.91.
Delphi Method. A group of experts can give a better estimate. The Delphi Method:
- Coordinator provides each expert with the spec
- Experts discuss estimates in an initial group meeting
- Each expert gives an estimate in interval format: most likely value plus an upper and lower bound
- Coordinator prepares a summary report indicating group and individual estimates
- Group iterates until consensus
Function Point Method. Five key components are identified based on the logical user view of the application: External Inputs, External Outputs, External Inquiries, Internal Logical Files, and External Interface Files.
Downside
- Function Point terms are confusing
- Too long to learn; needs an expert
- Needs too much detailed data
- Does not reflect the complexity of the application
- Does not fit with new technologies
- Takes too much time
- "We tried it once"
For each component, compute a Function Point value based on its make-up and the complexity of its data. Complexity is rated Low/Average/High from record element types, data elements (# of unique data fields), and file types referenced.

Components:                    Low     Avg     High    Total
Internal Logical File (ILF)    __ x 7  __ x 10 __ x 15  ___
External Interface File (EIF)  __ x 5  __ x 7  __ x 10  ___
External Input (EI)            __ x 3  __ x 4  __ x 6   ___
External Output (EO)           __ x 4  __ x 5  __ x 7   ___
External Inquiry (EQ)          __ x 3  __ x 4  __ x 6   ___
Total Unadjusted FPs                                    ___
When to Count. Sizing is repeated across the lifecycle: at prospectus, requirements, architecture, implementation, testing, and delivery, and again with each change request during corrective maintenance.
Estimates vary with risk factors:
- Technology (tools, languages, reuse, platforms)
- Processes, including tasks performed, reviews, testing, object-oriented methods
- Customer/user and developer skills
- Environment, including locations & office space
- System type, such as information systems, control systems, telecom, real-time, client-server, scientific, knowledge-based, web
- Industry, such as automotive, banking, financial, insurance, retail, telecommunications, DoD
Using the equations. For a 59-function-point project to be written in C++, we need to write 59 x 53 = 3,127 SLOC.
Effort = (productivity)^-1 (size)^c = [1 / (0.9 x 0.53 KSLOC/SM)] x (3.127 KSLOC)^1.02 ≈ 2.1 x (3.127)^1.02 ≈ 7 SM
Baseline current performance levels. (Chart: measured baseline of performance and productivity capabilities vs. software size, comparing the organization baseline against industry averages, best practices, and sub-performance. Performance is driven by software process improvement (time to market, effort, defects) and by management, skill levels, process, and technology; productivity improvement initiatives / best practices and risks feed the measured baseline.)
Modeling Estimation. Workflow: an analyst establishes a profile from the requirement; a counter sizes it; the project manager selects a matching profile from the metrics database, generates an estimate, and performs what-if analysis; actuals feed back through plan-vs-actual reports. The estimate is based on the best available information: a poor requirements document will result in a poor estimate. Accurate estimating is a function of using historical data with an effective estimating process.
Establish a baseline. (Chart: rate of delivery, in function points per staff-month, vs. software size for a representative selection of projects.) Size is expressed in terms of functionality delivered to the user; rate of delivery is a measure of productivity.
Monitoring improvements. (Chart: track progress in the second year on the same axes: rate of delivery, in function points per person-month, vs. software size.)
Brooks, Calling the Shot
- Do not estimate the whole task by estimating coding and multiplying by 6 or 9!
- Effort increases as a power of size
- Unrealistic assumptions about developers' time: studies show at most 50% of the time is allotted to development
- Productivity is also related to the complexity of the task: the more complex, the fewer lines per year; high-level languages & reuse are critical
Editor's Notes

  • #3: Just differentiating metrics and their uses (and most metrics could be used for both).
  • #30: Function points have their fudge factors too, and most practitioners do not use the unadjusted function point metric.
  • #31: There is also a judgment made on the complexity of each of the components that were counted.
  • #38: Adjusted function points consider factors similar to those of the advanced COCOMO model. These factors are shown on the next slide. In training, folks should strive for consistency in counting function points … that is, scorers should agree on the counts of the factors, the complexity of each of the counts and the scoring of the characteristics -- this can be achieved but it takes a fair amount of training. The measure of scorers agreeing with each other is often referred to as inter-rater reliability.
  • #39: In other software engineering courses at Stevens you will learn to calculate function points. It is a skill that has to be acquired, and it does take a while. Function points are currently one of the more popular ways to estimate effort. B&Y stress it heavily in chapter 6.
  • #40: Here’s a listing of the advantages and disadvantages of function points from B&Y pp. 183-4. Function points are certainly more difficult to fudge than SLOC since they address aspects of the application. The other emphasis is on data collection -- you are only as good as your historical data, and if you use these techniques extensively you should endeavor to continue to collect data and tune the metrics from experience.
  • #41: For completeness, and to provide you with a feel for the degree of effort a function point represents, here’s another table mapping several computer languages to function points.