Recap: last week's material
Last week, we covered the main gist of software engineering: what it is about and how we apply it to produce both generic and custom products. The evolution of design techniques, along with the core principles of software practice, was covered as well.
SOFTWARE DEVELOPMENT LIFE CYCLE (SDLC)
SDLC is the process followed for a software project: a detailed plan describing how to develop, maintain, replace, and alter or enhance specific software. Simply put, it is a methodology for improving software quality and the overall development process. It involves six stages:
1. Planning and requirement analysis
2. Defining requirements
3. Designing the product architecture
4. Building or developing the product
5. Testing the product
6. Deployment and maintenance
LIFE CYCLE MODELS
Life cycle models, aka process models, come in handy when developing a software product. Generally, a life cycle model represents all the activities needed to take a software product through its life cycle phases, and defines entry and exit criteria for every phase. Software project managers use life cycle models to monitor the progress of a particular project, assign roles that reflect the responsibilities of the people involved, and set pre- and post-conditions, which are like before-and-after statements for a process activity. It is also important to keep in mind that, in practice, most processes mix elements of plan-driven approaches (planning is done in advance and change is difficult) and agile approaches (planning is incremental and change is welcome).
Classical Waterfall Model
This model is considered the basic block upon which all other life cycle models were built. However, the classical waterfall model is no longer used in practical development projects, since it doesn't support change and cannot handle errors committed during any of the phases (as seen in the image on the right). Today, this model is regarded as an anti-pattern that shouldn't be followed.
Iterative Waterfall Model
The basic idea behind this model is to develop a system through repeated cycles (iterative) and in smaller portions at a time (incremental). The iterative waterfall model is probably the most widely used software development model so far, since corrective measures can be taken through feedback loops at most stages. Generally, rigorous validation of requirements and verification & testing of each version of the software are needed for best results. However, this model is only suitable for well-understood problems that aren't subject to many risks. From the image on the left, you will notice that several iterations are needed to enhance evolving versions until the full system is implemented. At each iteration, design modifications are introduced along with added functional capabilities.
Prototyping Model
The prototyping model is used when either the user requirements or the underlying technical aspects are not well understood. It is popular for the development of the user-interface part of projects.
A prototype is usually low in reliability, with limited functional capabilities and inefficient performance compared to the actual software, since it is built using several shortcuts and dummy functions. It only looks like the actual system but doesn't function like it, as the actual computations aren't performed. As seen from the image on the right, this model follows a similar structure to that of the classical waterfall model.
Spiral Model
This model is also called a meta model, as it subsumes all other life cycle models; risk handling is built into every phase. The spiral model is used for developing technically challenging software products that are prone to several kinds of risks. It combines the idea of iterative development with the systematic, controlled aspects of the waterfall model. In other words, it is like the iterative waterfall model, but with a very high emphasis on risk analysis.
The diagrammatic representation can be seen on the left; it appears like a spiral with many loops. The exact number of loops is not fixed as each loop represents a phase of the software process. To most people, this model is much more complex than the other models, which is why it isn’t used for ordinary projects.
Checklist aka Things you should [kinda] know after reading this post:
References:
IIT KHARAGPUR. Software Engineering. Retrieved from nptel.ac.in.
Sommerville, Ian. Software Engineering. Retrieved from SlideShare.
Tutorials Point. SDLC. Retrieved from tutorialspoint.com/sdlc.
Background Story: Hey, I'm Yasmin, a soon-to-be junior CS Major at UoPeople with 80 credit hours down. I will be using my blog as a platform to help me study and share what I learn on a weekly basis with my fellow classmates and readers. Follow the Studying CS at UoPeople section for weekly blog posts published every Sunday. Please note, this series will not interfere with the weekly Monday posts. Thanks for reading. Cheers!
Recap: last week's material
Last week, we covered the main gist of data structures: what they are about and how we utilize them to solve problems. Design patterns were briefly touched upon, and the attributes any good algorithm should have were covered as well.
ASYMPTOTIC ANALYSIS
Asymptotic analysis is an estimation technique that measures the efficiency of an algorithm, or its implementation, as the input size of a program increases. Generally, you will need to analyze the time required by an algorithm and the space required by a data structure. Here are [The Big Five], a set of notations I will be elaborating on shortly. (I thought I'd mention them before I start, as they are practically what this whole section is about.)
Growth rate
The growth rate of an algorithm is the rate at which its cost grows as the size of its input grows. Check out the table below (adapted from Princeton.edu):
| Complexity | Description |
| --- | --- |
| 1 | Constant growth does not depend on the input size. |
| log N | Logarithmic growth gets only slightly slower as N grows. |
| N | Linear growth is optimal if you need to process N inputs. As N doubles, so does the time. |
| N log N | Linearithmic growth scales to huge problems. As N doubles, the time more than doubles. |
| N^2 | Quadratic growth is practical only for relatively small problems. As N doubles, the time increases four-fold. |
| N^3 | Cubic growth is practical only for small problems. As N doubles, the time increases eight-fold. |
| 2^N | Exponential growth is not appropriate for practical use. As N doubles, the time squares. |
| N! | Factorial growth is worse than exponential. As N increases by 1, the time increases by a factor of N. |
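As a quick sanity check of the table, here is a small sketch (the `GrowthRates` class name and `scalingFactor` helper are my own, not from the table's source) that measures how much each cost function grows when N doubles:

```java
import java.util.function.DoubleUnaryOperator;

public class GrowthRates {
    // Returns how much the cost grows when the input size doubles from n to 2n.
    static double scalingFactor(DoubleUnaryOperator cost, double n) {
        return cost.applyAsDouble(2 * n) / cost.applyAsDouble(n);
    }

    public static void main(String[] args) {
        double n = 1000;
        System.out.printf("linear:    x%.2f%n", scalingFactor(x -> x, n));         // x2.00
        System.out.printf("quadratic: x%.2f%n", scalingFactor(x -> x * x, n));     // x4.00
        System.out.printf("cubic:     x%.2f%n", scalingFactor(x -> x * x * x, n)); // x8.00
        System.out.printf("log:       x%.2f%n", scalingFactor(Math::log, n));      // barely above 1
    }
}
```

Notice how the logarithmic factor is barely above 1 while the cubic one is exactly 8, matching the "As N doubles..." column above.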
Upper & Lower bounds + Θ notation
Various terms describe the running-time equation of an algorithm. These terms, along with their associated symbols, indicate which aspect of the algorithm's behavior is being described. One is the upper bound and another is the lower bound. When describing the upper bound of an algorithm, our statement should be about some class of inputs of size n, in either the best-case, average-case, or worst-case scenario. An appropriate description would be: "this algorithm has an upper bound to its growth rate of n^2 in the average case." However, the phrase "this algorithm has an upper bound to its growth rate of f(n)" is too long, and is generally replaced with big-Oh notation. The previous statement becomes: "O(n^2) in the average case." Keep in mind that big-Oh only gives an upper bound; it is most often quoted for the worst-case scenario, but it can describe any case.
The lower bound is the exact opposite of big-Oh, it is the lowest amount of some resource (usually time) that is required by an algorithm for some class of inputs of size n. The lower bound for an algorithm is denoted by the symbol Ω and is pronounced “big-Omega” or just “Omega.” On the other hand, when both the upper and lower bounds are the same within a constant factor, we indicate this by using Θ (big-Theta) notation. Lastly, in time complexity analysis we should always try to find the tight bound aka Θ (big-Theta) notation as it provides the best idea of the time taken.
f(n) ∈ o(g(n)) ≡ ∀c > 0, ∃n0 > 0 such that ∀n ≥ n0, 0 ≤ f(n) < c·g(n) //not important, just 4 ur info
Translation in layman's terms: the set of functions o(g(n)) [pronounced little-oh of g(n)] is defined to be the set of functions f(n) satisfying the following condition: for every positive constant c, some n0 exists such that for all n bigger than n0, f(n) is strictly smaller than the constant times g(n). In other words, f(n) grows strictly slower than g(n).
f(n) ∈ O(g(n)) ≡ ∃c > 0, n0 > 0 such that ∀n ≥ n0, 0 ≤ f(n) ≤ c·g(n)
Translation in layman's terms: the set of functions O(g(n)) is defined to be the set of functions f(n) satisfying the following condition: there exist some positive constant c and some n0 such that for all n bigger than n0, f(n) is no bigger than the constant times g(n). In other words, f(n) grows no faster than g(n).
f(n) ∈ Θ(g(n)) ≡ f(n) = O(g(n)) and f(n) = Ω(g(n))
This one is straightforward: the set of functions Θ(g(n)) is defined to be the set of functions f(n) whose upper and lower bounds are the same within a constant factor. In other words, f(n) grows at the same rate as g(n).
f(n) ∈ Ω(g(n)) ≡ ∃c > 0, n0 > 0 such that ∀n ≥ n0, 0 ≤ c·g(n) ≤ f(n)
Translation in layman's terms: the set of functions Ω(g(n)) is defined to be the set of functions f(n) satisfying the following condition: there exist some positive constant c and some n0 such that for all n bigger than n0, f(n) is at least as big as the constant times g(n). In other words, f(n) grows at least as fast as g(n).
f(n) ∈ ω(g(n)) ≡ ∀c > 0, ∃n0 > 0 such that ∀n ≥ n0, 0 ≤ c·g(n) < f(n) //not important, just 4 ur info
Translation in layman's terms: the set of functions ω(g(n)) [pronounced little-omega of g(n)] is defined to be the set of functions f(n) satisfying the following condition: for every positive constant c, some n0 exists such that for all n bigger than n0, f(n) is strictly bigger than the constant times g(n). In other words, f(n) grows strictly faster than g(n).
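These definitions can be probed numerically. Here is a minimal sketch (the `BigOhCheck` name, the example f(n) = 3n + 10, and the witnesses c = 4, n0 = 10 are all my own choices) that checks the big-Oh condition over a range of n:

```java
public class BigOhCheck {
    // Example function: f(n) = 3n + 10. Claim: f(n) = O(n) with c = 4, n0 = 10.
    static long f(long n) { return 3 * n + 10; }

    // Checks the definition: f(n) <= c * g(n) for all n0 <= n <= limit, with g(n) = n.
    static boolean holds(long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (f(n) > c * n) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(4, 10, 100000)); // true: the witnesses work
        System.out.println(holds(4, 1, 100000));  // false: n0 = 1 is too small (fails below n = 10)
    }
}
```

A finite loop can't prove the "for all n" part, of course, but it is a handy way to convince yourself that a pair of witnesses c and n0 is plausible.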
ALGORITHM PERFORMANCE
The primary consideration when estimating an algorithm’s performance is the number of basic operations required by the algorithm to process an input of a certain size (the number of inputs processed). Keep in mind that a basic operation must have the property that its time to complete does not depend on the particular values of its operands. Basic operations include: evaluating an expression, assigning a value to a variable, indexing into an array, calling a method, returning from a method, etc.
Generally, an algorithm may run faster on some inputs than it does on others of the same size. We might be eager to just average everything up and move on but that usually doesn’t work. An average case analysis requires that we understand how the actual inputs to the program (and their costs) are distributed with respect to the set of all possible inputs to the program.
Programmers usually stick with worst-case analysis because it guarantees the algorithm will perform at least that well; relying on the average or best case is riskier. Worst-case analysis is also easier than average-case analysis, since it requires only the ability to identify the worst-case input. Best-case analysis, on the other hand, describes the way an algorithm behaves under optimal conditions (not very realistic).
RUNNING TIME OF AN ALGORITHM
Generally, the running time of any algorithm depends on a bunch of factors: whether your machine has a single processor or multiple processors, the generation of the machine and whether it is 32-bit or 64-bit, and the read/write speed of its memory and disk. Lastly, it also depends on the kind of input you give your program, which is the main factor. All you should be concerned about is how your program behaves for various inputs, i.e., how the time taken by the program grows. Hence, we are interested in the rate of growth with respect to the input.
In order to estimate the running time of an algorithm, you need to read each line of the given code segment, assign a cost to each operation, and note the number of times that particular operation repeats. You will see what I mean from the example on matrix multiplication:
| Algorithm | Cost | # of Repetitions |
| --- | --- | --- |
| int n = A.length; | c0 | 1 time |
| … | … | … |
| Total number of operations: | | |
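Only the first row of the cost table survives above, so as a rough sketch of where the full tally ends up, here is an instrumented matrix multiplication (the `MatrixCost` name and `mults` counter are my own) that counts scalar multiplications, the basic operation that dominates the total:

```java
public class MatrixCost {
    static long mults = 0; // counts the basic operation: one scalar multiplication

    static int[][] multiply(int[][] A, int[][] B) {
        int n = A.length;                     // cost c0, executed once
        int[][] C = new int[n][n];
        for (int i = 0; i < n; i++)           // outer loop: n iterations
            for (int j = 0; j < n; j++)       // middle loop: n^2 iterations total
                for (int k = 0; k < n; k++) { // inner loop: n^3 iterations total
                    C[i][j] += A[i][k] * B[k][j];
                    mults++;
                }
        return C;
    }

    public static void main(String[] args) {
        int n = 10;
        multiply(new int[n][n], new int[n][n]);
        System.out.println(mults); // 1000, i.e. n^3
    }
}
```

Summing cost × repetitions row by row gives a cubic polynomial in n, which is why matrix multiplication done this way is O(n^3).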
Time Complexity Analysis – General Rules
We analyze time complexity in most cases for very large input sizes and the worst-case scenario. Suppose you want the big-Oh notation for a function, say a polynomial. Rule 1: drop all the lower-order terms. Rule 2: drop the constant multiplier as well. Example: for T(n) = 17n^4 + 3n^3 + 4n + 8, all terms but the first are insignificant for large values of n; hence, your big-Oh is O(n^4). The same applies to 'log'. For instance, T(n) = 16n + log n = O(n).
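To see why Rules 1 and 2 are safe, here is a small sketch (the `DominantTerm` name is my own) showing that the example polynomial above is pinned down by its leading term for large n:

```java
public class DominantTerm {
    // T(n) = 17n^4 + 3n^3 + 4n + 8, the example polynomial from the rules above.
    static double T(double n) {
        return 17 * Math.pow(n, 4) + 3 * Math.pow(n, 3) + 4 * n + 8;
    }

    public static void main(String[] args) {
        for (double n : new double[]{10, 100, 1000}) {
            // The ratio T(n) / n^4 settles toward the constant 17 as n grows...
            System.out.printf("n = %4.0f   T(n)/n^4 = %.4f%n", n, T(n) / Math.pow(n, 4));
        }
        // ...and Rule 2 then drops that constant, leaving T(n) = O(n^4).
    }
}
```

By n = 1000 the lower-order terms contribute well under one percent, which is exactly what "insignificant for large values of n" means.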
Moreover, you can calculate the running time of an algorithm by summing up the running times of the fragments of the program, which brings us to Rule 3: Running time = ∑ running time of all fragments. Example:
| Fragment | Code | Complexity |
| --- | --- | --- |
| Fragment 1: simple statements | int a; a = 5; a++; | O(1) |
| Fragment 2: single loop | for (i = 0; i < n; i++) { //simple statements } | O(n) |
| Fragment 3: nested loop | for (i = 0; i < n; i++) { for (j = 0; j < n; j++) { //simple statements } } | O(n^2) |

Now, just add all the fragments: T(n) = O(1) + O(n) + O(n^2) = O(n^2)
However, suppose we have conditional statements, like an if-else control structure, where one fragment of the code has a growth rate of O(n) while the other fragment has a growth rate of O(n^2). Will it still be valid to sum all fragments? Example:
Explanation: with a conditional statement, the program goes like this: if some condition is true, we have a single loop with a time complexity of O(n); if the condition is not true, we have a nested double loop with a time complexity of O(n^2). If control goes to the first part, it executes in O(n); if control goes to the else part, it executes in O(n^2). Since we always follow the worst-case scenario, T(n) = O(n^2) in this example. Hence, in the case of conditional statements, you don't simply add the fragments, but pick the maximum of the two. Rule 4: For conditional statements, pick the complexity of the branch that is the worst case.
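Rule 4 can be made concrete with a small sketch (the `ConditionalCost` name and `steps` counter are my own) that counts the work done by each branch:

```java
public class ConditionalCost {
    static long steps;

    static void run(boolean condition, int n) {
        steps = 0;
        if (condition) {
            for (int i = 0; i < n; i++) steps++;       // O(n) branch
        } else {
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) steps++;   // O(n^2) branch
        }
    }

    public static void main(String[] args) {
        run(true, 100);
        System.out.println(steps);  // 100: the cheap branch
        run(false, 100);
        System.out.println(steps);  // 10000: the worst case, so T(n) = O(n^2)
    }
}
```

Only one branch ever executes, so summing would overcount; the bound we report is the maximum of the two.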
You have reached the end of this blog post. Thanks for reading! I will leave you now with some recurrence relations to remember. Have a great day!
Recurrence Relations

| Recurrence | Algorithm | Big-Oh Solution |
| --- | --- | --- |
| T(n) = T(n/2) + O(1) | Binary Search | O(log n) |
| T(n) = T(n-1) + O(1) | Sequential Search | O(n) |
| T(n) = 2T(n/2) + O(1) | Tree Traversal | O(n) |
| T(n) = T(n-1) + O(n) | Selection Sort | O(n^2) |
| T(n) = 2T(n/2) + O(n) | Merge Sort | O(n log n) |
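As a sanity check on the first recurrence, here is an instrumented binary search (the `BinarySearchSteps` name and `steps` counter are my own) showing that an absent key in about a million sorted elements costs only about 20 comparisons:

```java
public class BinarySearchSteps {
    static int steps;

    static int search(int[] a, int key) {
        steps = 0;
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            steps++;                        // T(n) = T(n/2) + O(1): constant work per halving
            int mid = (lo + hi) >>> 1;
            if (a[mid] == key) return mid;
            if (a[mid] < key) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int n = 1 << 20;                    // about a million elements
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        search(a, -1);                      // worst case: the key is absent
        System.out.println(steps);          // about 20, i.e. log2(n)
    }
}
```

Each loop iteration halves the remaining range, which is exactly what T(n) = T(n/2) + O(1) describes, and unrolling that recurrence gives the O(log n) in the table.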
Checklist aka Things you should [kinda] know after reading this post:
References:
[Textbook] Shaffer, C. A. (2011). A Practical Introduction to Data Structures and Algorithm Analysis. Retrieved from http://courses.cs.vt.edu/cs3114/Spring09/book.pdf
[Videos] Asymptotic Definitions Part 1 by David Taylor | Asymptotic Definitions Part 2 by David Taylor | Algorithm Analysis for Different Control Structures by S. Saurabh | Big-O notation by Jamal Thorne | Time complexity analysis by MyCodeSchool
[Webpages] Analysis of Algorithms (Java) | Analysis of Algorithms by Daisy Tang
[Lecture Slides] Asymptotic Running Time of Algorithms
Introduction to Software Engineering
Software costs dominate computer system costs, which is why [cost-effective] software development with good attributes like maintainability, dependability & security, efficiency, and acceptability is important. Hence, it is good to remember that reusing existing software, when appropriate, is preferable to writing new software from scratch. Software products come in two forms: generic products, which are stand-alone systems sold to the public with rights fully owned by the developer, and customized products, which are commissioned by a specific client who is granted sole rights to the software. Generally, software engineering involves 4 activities:
1. Software specification
2. Software development
3. Software validation
4. Software evolution
On a different note, one thing I noticed is that rookie programmers tend to use the concepts of a program and a software product interchangeably. Before taking this unit, I too was guilty of that, mainly because I didn't know the difference between the two:
PROGRAM VS. SOFTWARE PRODUCT

| Program | Software Product |
| --- | --- |
| Small in size with limited functionality. | Extremely large with a lot of features. |
| The programmer himself is the sole user. | Most users aren't involved in development. |
| Single developer, aka the programmer. | Large number of developers involved. |
| User interface is not very significant. | User interface is carefully implemented. |
| Very little or no documentation. | Well documented. |
| Follows the programmer's individual style. | Follows software engineering principles. |
Software engineering reduces programming complexity via two techniques: abstraction and decomposition. The main purpose of abstraction is to omit irrelevant details in order to focus on the relevant ones, suppressing aspects that don't matter for the given purpose. Once the simpler problem is solved, the omitted details are brought back in to solve the next lower level of abstraction. Decomposition, on the other hand, divides a complex problem into several smaller parts, which are then solved one by one. A good decomposition of a problem minimizes interactions among the various components.
One more thing to keep in mind is that software engineering ethics is more than the mere application of technical skills: it is the practice of principles that are morally correct, such as respecting confidentiality, being competent, adhering to intellectual property rights, and correctly using computers. You may check out the ACM/IEEE code of ethics for more information.
Challenges & Evolution
The main challenge here lies in the current software crisis, driven by increased expenditure on software products. Today, organizations spend ever larger portions of their budget on software, and yet are faced with problems like:
- Software exceeding its budget
- Delivery running late
- Unreliable, failure-prone software
- Software that is difficult and expensive to maintain
These problems can only be battled by spreading awareness of software engineering practices among engineers, along with further advancements within the software engineering discipline itself. However, addressing the factors contributing to this crisis is also important: larger problem sizes, lack of adequate training in software engineering, an increasing skill shortage, and low productivity improvements.
As Søren Kierkegaard once said, "Life can only be understood backwards; but it must be lived forwards." The same concept applies to software engineering. It is important to know past failures, successes, and major milestones in order to utilize current technologies for the betterment of the software engineering discipline.
EVOLUTION OF SOFTWARE DESIGN TECHNIQUES

| Era | Technique |
| --- | --- |
| 1950s | Most programs were written in assembly language, and programmers followed exploratory programming: programming based on a developer's individual style or intuition. |
| Early 1960s | Languages like FORTRAN, ALGOL, and COBOL were introduced. The exploratory programming style started to prove insufficient, as programmers had a hard time writing cost-effective, correct programs and understanding & maintaining programs written by others. |
| Late 1960s | The "GOTO" statement was found to be harmful and the main reason the control structure of a program becomes complicated and messy. Structured programming then emerged; it uses 3 types of program constructs: sequence, selection, and iteration. |
| Late 1970s | Data structure-oriented design became the next hit: programmers were encouraged to pay more attention to the design of a program's data structures than to the design of its control structure. |
| 1980s | Object-oriented design was introduced and is now the most widely used technique. |
Lastly, I am pretty much done. Please bear in mind that this entire post serves as an introduction to the unit. I will leave you now with the core principles of software practice. Have a great day!
Checklist aka Things you should [kinda] know after reading this post:
References:
IIT KHARAGPUR. Software Engineering. Retrieved from nptel.ac.in.
Sommerville, Ian. Software Engineering. Retrieved from SlideShare.
Introduction to Data Structures
The fundamental role of most (if not all) computer programs is to store and retrieve information as quickly as possible. As beginner coders, we usually focus on our programs performing calculations correctly, while neglecting the speed of information retrieval and storage. In that regard, this unit will teach us how to structure information to support efficient processing. But before we get to that, we need to mentally prepare ourselves for three things (that we'll get the hang of in the future):
**Those three things, aka "KAM" (which means "how much" in Arabic), are what I'd like to know after finishing this course. (The acronym is mainly something to help me remember those 3 things; feel free to ignore it.)
Design Patterns
Data structures are about designing algorithms that make [efficient] use of a computer's resources. Generally, a design pattern embodies vital design concepts for a recurring problem. In other words, a specific design pattern emerges from the discovery that a particular design problem occurs frequently in various contexts. One thing you should keep in mind is that a design pattern describes the structure of a design solution. It comes with both costs and benefits, which is where the concept of trade-offs kicks in.
A trade-off is something like preferring or choosing one thing over another in order to gain optimal results within the set constraints. Given that, any design pattern might vary in its application to match the many trade-offs of a given situation. Hence, it is important to follow certain steps when selecting a data structure to solve a problem:
So, without further ado, here are 4 example design patterns (that I won't be elaborating on):
Flyweight Pattern: Mainly comes in handy when you have an application with many objects, some of which are basically the same in terms of information and role, yet have to be reachable from various places as conceptually distinct objects. It reduces memory cost by sharing storage for the information those objects have in common.
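Here is a minimal flyweight sketch (the `Glyph` and `GlyphFactory` names are hypothetical, my own illustration rather than anything from the course): many logical characters in a document share a handful of actual objects.

```java
import java.util.HashMap;
import java.util.Map;

public class FlyweightDemo {
    // Intrinsic (shared) state: the character itself. Extrinsic state, such as
    // position on the page, would be supplied by the caller, so one Glyph
    // object can appear in many places.
    static class Glyph {
        final char symbol;
        Glyph(char symbol) { this.symbol = symbol; }
    }

    // The factory hands out shared instances instead of allocating new ones.
    static class GlyphFactory {
        private final Map<Character, Glyph> pool = new HashMap<>();
        Glyph get(char c) { return pool.computeIfAbsent(c, Glyph::new); }
        int poolSize() { return pool.size(); }
    }

    public static void main(String[] args) {
        GlyphFactory factory = new GlyphFactory();
        String text = "hello hello hello";
        for (char c : text.toCharArray()) factory.get(c);
        // 17 logical characters, but only 5 distinct Glyph objects: h, e, l, o, space
        System.out.println(factory.poolSize());
    }
}
```

The saving is small here, but scale the idea up to every character of a long document and the shared pool is what keeps memory cost bounded.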
Visitor Pattern: This pattern has to do with trees. Have you heard of tree traversal? Let's say you've got a tree of objects describing a page layout and you want to perform some activity on each node in the tree, in some kind of order, without changing the elements themselves. A visitor basically visits every node in the tree in a [defined order]. In a nutshell, it represents an operation to be performed on the elements of an object structure.
Composite Pattern: Is all about a hierarchy of object types and a bunch of actions. It has to do with a subclass hierarchy defining specific sub-types (I know, not the best definition, but you get the idea).
Strategy Pattern: In layman’s terms, this pattern is about encapsulating an activity that is part of a larger process.
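To make the strategy idea concrete, here is a small sketch (the `StrategyDemo` name and `formatReport` example are my own, not from the course): the larger process stays fixed while the encapsulated activity is swapped in.

```java
import java.util.function.UnaryOperator;

public class StrategyDemo {
    // The larger process (producing a report) is fixed; the encapsulated
    // activity (how each line is transformed) is the interchangeable strategy.
    static String formatReport(String[] lines, UnaryOperator<String> strategy) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) sb.append(strategy.apply(line)).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] lines = {"alpha", "beta"};
        // Same process, two different strategies:
        System.out.print(formatReport(lines, String::toUpperCase));
        System.out.print(formatReport(lines, s -> "- " + s));
    }
}
```

Classic presentations use an interface with named implementations; a function argument is the lightweight modern equivalent, and the trade-off point above applies: you pay a little indirection to gain the freedom to vary one step independently.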
Algorithms and Problem Solving
A problem, as we all know from math class, is something that needs to be solved, and an algorithm in this case is the method or process followed to get the solution. Here are some facts you need to know before I get to the point: any modern programming language can be used to implement any algorithm, and an algorithm should by default provide enough detail that it can be converted into a program whenever necessary. Having stated the facts, I will move on to the properties any algorithm should possess in order to solve a particular problem.
On a different note, this unit is definitely a brain whacker! Okay, now that we have established that the unit is hard, please check out Jeliot; it is an important piece of software for UoPeople students taking this course, and maybe for other students too. You can download it here. I am pretty much done; below is a summary of the things covered. Please bear in mind that this entire post serves as an introduction to the unit.
Checklist aka Things you should [kinda] know after reading this post:
References:
Shaffer, C. A. (2011). A Practical Introduction to Data Structures and Algorithm Analysis. Retrieved from http://courses.cs.vt.edu/cs3114/Spring09/book.pdf