Friday, October 30, 2009

Just how frequently is 'refactor frequently'?


I was asked what I meant by 'refactor frequently' in the guidelines and could not come up with a good answer to that... to me the question was wrong. "When should you not refactor?" is more along the lines of my thinking. Not that I think there's a great answer to that either, but then I did skip over the Google results about when to stop refactoring or when not to refactor while searching for these quotes.

I tend to refactor with support from the DRY principle: as soon as I notice something is redundant or repeated. When I'm primarily coding, I refactor on the first or second redundancy I notice; when I'm reviewing, I try to refactor any redundancy I find. I also refactor based on the other Code Smells, and when I notice violations of SOLID or Domain-Driven Design (DDD for short). Now I've also started to think about Command-Query Separation (CQS).

When I see code that looks ugly but fits these, I'll shoot for refactoring towards a Fluent Interface with Method Chaining.
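As a rough illustration of what refactoring toward a fluent interface with method chaining can look like, here is a minimal sketch. The `EmailBuilder` class and its method names are hypothetical, made up for this example rather than taken from any project mentioned here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type: each configuration method returns `this`,
// so calls can be chained into one readable expression.
public class EmailBuilder
{
    private readonly List<string> _recipients = new List<string>();
    private string _subject = "";
    private string _body = "";

    public EmailBuilder To(string address)
    {
        _recipients.Add(address);
        return this;
    }

    public EmailBuilder WithSubject(string subject)
    {
        _subject = subject;
        return this;
    }

    public EmailBuilder WithBody(string body)
    {
        _body = body;
        return this;
    }

    // Terminal call: produces the result instead of returning the builder.
    public string Build()
    {
        return "To: " + string.Join(", ", _recipients.ToArray()) + "\n"
             + "Subject: " + _subject + "\n\n"
             + _body;
    }
}
```

A call site then reads as a single chain, for example `new EmailBuilder().To("team@example.com").WithSubject("Refactoring").WithBody("...").Build()`.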

I think I already do command query separation but I haven't been looking for it in particular.
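For anyone who wants to look for it in particular, the usual statement of command-query separation is that a method either changes state and returns nothing (a command) or returns a value and changes nothing (a query). A small sketch, using a hypothetical `VisitCounter` class invented for this example:

```csharp
using System;

// Hypothetical example type illustrating command-query separation.
public class VisitCounter
{
    private int _count;

    // Query: returns a value and leaves state untouched.
    public int Count
    {
        get { return _count; }
    }

    // Command: changes state and returns nothing.
    public void RecordVisit()
    {
        _count++;
    }

    // Mixes both roles (mutates and returns a value), shown only for contrast.
    public int RecordVisitAndGetCount()
    {
        return ++_count;
    }
}
```

The last method is the kind of thing CQS would flag: calling it to read the count also changes the count.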

And here's the email I sent out to the team. It appears that what I just wrote here probably should have been included at the top... oh well. A little context: the email was written in response to a meeting on the coding standards I put forth.

... asked me for guidelines in our meeting about how frequent is ‘refactor frequently’? I had difficulty answering the question and I think everyone here was aware of that. I hope to address this question with some support from the blogosphere/industry.


·         Before any redesign
o   Helps you get familiar or re-familiarize yourself with the existing code
o   Helps solidify the existing design for the addition of new features.
·         Before any new project that builds on top of the old one
·         Before adding new features to a stable build
·         Semi-continuous refactoring
o   Code a little, test a little, refactor a little.
·         Make it your goal to leave the code in better shape at the end of the day than at the beginning

Ideas from Elliotte Rusty Harold, author of the book Refactoring HTML, from The Cafes: When to Refactor

Refactoring throughout the entire project life cycle saves time and increases quality.

Refactor mercilessly to keep the design simple as you go and to avoid needless clutter and complexity. Keep your code clean and concise so it is easier to understand, modify, and extend. Make sure everything is expressed once and only once. In the end it takes less time to produce a system that is well groomed.


When?

The sooner the better as is easier, faster and less risky to refactor over a recently refactored code rather than waiting to refactor for the code to be almost completed.


What is refactoring and why?

XP Practices: Refactoring

Refactoring is the process of clarifying and simplifying the design of existing code, without changing its behavior. Un-refactored code tends to rot. Rot takes several forms: unhealthy dependencies between classes or packages, bad allocation of class responsibilities, duplicate code, and many other varieties of confusion and clutter. (Check out this list of such "design smells.")

Rot is what makes code difficult to maintain or extend. Every time we change code without refactoring it, rot worsens and spreads. Code rot frustrates us, costs us time, and unduly shortens the lifespan of useful systems.

Refactoring code ruthlessly prevents rot, by keeping code easy to maintain and extend. This extensibility is the reason to refactor and the measure of its success. This is what enables XP teams to embrace arbitrary and drastic change. Note that the XP practice of Test-driven Development (TDD) is essential to refactoring. The exhaustive tests produced by TDD are what make it safe and orderly to make changes of any kind. This is why Adaption always teaches TDD and refactoring together.

Code Hygiene

What does "refactor ruthlessly" mean? It means striving as a matter of routine to keep the code's design simple and crystal clear. It means knowing the design principles and patterns that are vital to keeping code extensible, and knowing when to "refactor toward" them. It means refactoring both production code and test code frequently during the day, eliminating all forms of the "code smells" that are precursors to true rot.

Mainly it means never going home at the end of the day with "code debts" that need paying tomorrow (smelly sections that need cleaning up). This level of code hygiene may at first seem like a lot of extra work, but it pays you such dividends so soon and so regularly that you soon become addicted to it -- rather like TDD.

From Adaption

Web coding practice ideas

These are some technologies you may want to practice so you're ready when developing a website, or that you may want to consider for the web site you are building. I'm blogging them to give a potential client ideas for their shiny new business web site.

  • RSS feed (of sales, new products, etc..)
  • Login - enables other features
    • Use OpenId?
  • Shopping cart
    • can be done without a login
    • better done with
    • Can be done via amazon.com I believe
  • Product catalog
    • Advanced search capabilities
      • Hierarchical
    • May be able to integrate with amazon.com
  • Comments
    • Captcha
    • Usages
      • About business
      • About product
      • About web site
  • WishList
  • Sharing links
    • Buttons that add this page to a user's Delicious.com, StumbleUpon, or Digg, etc..
  • Breadcrumbs
    • Track a user's path through the site
  • Error Log
    • helps to watch for site problems, host problems, etc..
    • Could be web based
  • Gadgets
    • Meebo chat
      • Allow users to chat with you real time via your web page
    • Google voice calling
      • Allow users to call you (includes voicemail) via a link on your page
  • Blogroll
    • Links to articles that may interest your customers
    • Help drive up search engine ranking
  • Meta tuning
    • Helps push your site up on the search engines

Thursday, October 29, 2009

Linq-To-Sql IsDbGenerated=true

Do you ever make changes to a database schema that's being mapped via dbml files? As far as I know, the easiest way to update the dbml to match is to delete the related tables from it and then pull them back in from Server Explorer.

What do you lose when this happens? Any custom names you have given a foreign key relationship, and Auto Generated=true (the column attribute [Column(IsDbGenerated=true)]) on fields that should use their default value, like AddedDt columns.

So I haven't come up with a Design/compile time solution, but here's a run time solution I've come up with:

public partial class LqDataContext : System.Data.Linq.DataContext
{
    partial void OnCreated()
    {
        // Fails (in debug builds) if IsDbGenerated was lost from addedDt.
        Debug.Assert(Linq.GetColumnAttribute<blogMain>(item => item.addedDt).IsDbGenerated);
    }
}


So if your program creates an instance of the data context after the attribute has been lost, the assertion fails. You could easily put this check in one of your unit tests so that you are sure to run it before release. This makes use of the Member class and these methods for getting the column attributes from a linq table:

    /// <summary>
    /// Gets the column attribute for a property of a linq entity, by property name.
    /// Passing the name through Member.Name keeps it a design-time and compile-time
    /// literal, so it doesn't as easily get out of sync with the db.
    /// Usage example:
    /// var columnInfo = Linq.GetColumnAttribute<zApplication>(Member.Name<zApplication>(item=>item.lastUser));
    /// </summary>
    /// <typeparam name="TEntity">Should be your linq entity</typeparam>
    /// <param name="fieldName">Name of the mapped property</param>
    /// <returns>The column attribute, or null if exactly one is not found</returns>
    public static ColumnAttribute GetColumnAttribute<TEntity>(string fieldName)
    {
        System.Reflection.PropertyInfo prop = typeof(TEntity).GetProperty(fieldName);
        var info = prop.GetCustomAttributes(typeof(ColumnAttribute), true);
        if (info != null && info.Length == 1)
        {
            return (ColumnAttribute)info[0];
        }
        return null;
    }

    /// <summary>
    /// Uses an expression parameter to help with design-time and compile-time literal
    /// doesn't as easily get out of sync with the db
    /// Usage example:
    /// var columnInfo = Linq.GetColumnAttribute<zApplication>(item=>item.lastuser);
    /// </summary>
    public static ColumnAttribute GetColumnAttribute<TEntity>(Expression<Action<TEntity>> fieldNameExpression)
    {
        var fieldName = Member.Name<TEntity>(fieldNameExpression);
        return GetColumnAttribute<TEntity>(fieldName);
    }

    /// <summary>
    /// Uses an expression parameter to help with design-time and compile-time literal
    /// doesn't as easily get out of sync with the db
    /// Usage example:
    /// var columnInfo = Linq.GetColumnAttribute<zApplication>(item=>item.lastuser);
    /// </summary>
    public static ColumnAttribute GetColumnAttribute<TEntity>(Expression<Func<TEntity,object>> fieldNameExpression)
    {
        var fieldName = Member.Name<TEntity>(fieldNameExpression);
        return GetColumnAttribute<TEntity>(fieldName);
    }
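The Member class itself is not shown in this post. For readers who don't have it handy, here is a minimal sketch of how a helper like `Member.Name` could pull a member name out of a lambda expression. This is an assumption about its shape, not the actual class: it only handles plain member access and method calls.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the Member helper referenced above.
public static class Member
{
    // For properties and fields, e.g. item => item.addedDt
    public static string Name<T>(Expression<Func<T, object>> expression)
    {
        return NameOf(expression.Body);
    }

    // For void method calls, e.g. item => item.Save()
    public static string Name<T>(Expression<Action<T>> expression)
    {
        return NameOf(expression.Body);
    }

    private static string NameOf(Expression body)
    {
        // Value-typed members are boxed to object, which wraps the member
        // access in a Convert node; unwrap it first.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        string name = null;
        var member = body as MemberExpression;
        var call = body as MethodCallExpression;
        if (member != null)
        {
            name = member.Member.Name;
        }
        else if (call != null)
        {
            name = call.Method.Name;
        }

        if (name == null)
        {
            throw new ArgumentException("Expression must be a member access or a method call.");
        }
        return name;
    }
}
```

Because the name comes from a lambda rather than a string literal, renaming the property breaks the build instead of failing silently at run time.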

Wednesday, October 28, 2009

Project standards


This was an email I sent to my team today on some project standards



This should probably be a wiki, or a word document on SharePoint but I’d like to get the ball rolling for a list of ‘standard features’ we want to require/design for in all our applications.

I’m using the term business project to refer to anything related to a given assignment. Can anyone think of a better term that would encapsulate the possibility of multiple solution files, multiple Visual Studio projects, documentation, images, etc.?

I think of BLib as a business project which would encapsulate BLibTester, BLib.dll, BLib.vbproj, etc..
This idea should make sense in light of the recent folder structure discussion where you might have multiple solution files.

It would also cover your layers (UI, database/persistence, and Business Logic/App Domain).

·         Multiple projects
o   business projects should be at least broken into 3 pieces
§  UI
§  Database/persistence
§  Business Logic/App Domain (should have no dependencies on the other layers)
§  Other Possible layers:
·         Controller (UI logic that would be consistent between multiple UI types)
·         Testing (this could be a separate test layer, or integrated into the projects themselves, not sure where this would best be placed?)
o   Benefits
§  If the dependencies are done right this greatly reduces the number of changes needed to swap out parts
·          from one UI type to another
o   Asp.net Web forms
o   Asp.net MVC
o   SilverLight
o   Windows Forms
o   Windows Workflow
·         From one persistence layer to another
o   SQL server
o   Mainframe
o   ISeries
o   Teradata
§  Readability/Maintenance – if the UI is malfunctioning check the UI project first, if the business logic needs updating, go to the business logic section, etc..
·         Exception logging
o   All applications should have an error log, or communications log
o   Blog for most apps, and most log data
§  Some data may be more appropriately held in the repository, or a local data store on the user’s machine
o   Benefits
§  Full logging system with automatic repository fallback is written in BLib.dll
§  Any one of us can view log information when a program is malfunctioning, to speed up locating the problem and the functions in the code that may have been involved.
·         About section (help->about in most apps)
o   Should include detailed dependency information ( a list of all Dlls, their file version, assembly version, and modified date)
o   Benefits
§  Anyone can look at the application and see versioning information easily without leaving the program
§  Any one of us can check the version number of an application with a problem and see if it’s the latest version
§  Any one of us can look at the dependencies and see if the current problem could be the result of out-dated references
·         Business project folder structure consistency
o   All solutions and related resources (documentation, images shared between multiple projects in the business project) should reside in a single folder for the business project
o   All visual studio projects (.vbproj, .csproj, etc..) should reside in subfolders in this folder
o   Benefits
§  Consistency greatly improves maintainability: everyone knows where to look on your machine if they sit at your desk to help or just beta test your program for you.
§  Pulling down the entire business project, solution, or just VS project from the version control repository would come down in a consistent layout between projects.
·         Version Controlled
o   All projects/solutions should be version controlled
o   Benefits
§  Any one of us can pull down the latest source code and review it.
§  Breaking changes can be very easily undone.
§  Multiple levels of undo across months of work instead of just since your last save to disk.
§  You can pull a library into your code and debug your code alongside the library without fear of an auto build saving that out to the main copy
§  Say I make some important changes to BLib or BDartLib and haven’t committed them, and then there’s a bug in BDartLib or a change to the mainframe that someone else needs to fix while I’m out of the office. When I come back into the office, I get to compare and merge all 3 change sets (the previous commit, my changes, and your changes); nothing is lost.
§  Automatic change log, all changes can be seen for each commit, so please commit often.
·         Common library references to the latest Z drive version
o   References should point to the Z drive, not an old project or old dll that was copied from before
o   Code should be updated whenever you are opening it for a recompile or change to match the new libraries
o   Benefits
§  Instead of digging for an old copy of the project we can use the newest one
§  A system-wide change (think mainframe field movement) would require a change to the single newest-code library and then solutions that depend on it recompiled
§  A systematic change (for example if all libraries have had a bug that wasn’t noticed) would require a change to the single newest-code library and then solutions that depend on it recompiled
§  New features requests (think multiple projects need to show a new field) would not require double or more efforts to update both the newest library, then the older copies, then testing for each, separate recompiling of the dependent solutions for each
·         Pivotal Tracker Tracked
o   Features, chores, bugs should be entered in Pivotal Tracker for all projects
o   Benefits
§  The ticketing system will be able to create tickets in it
§  Any ideas for the project that got pushed aside don’t get forgotten
§  Historical feature, chore, and bug tracking
§  Historical comment/discussion preservation
§  New team mates can get up to speed on a project more easily (we can hope for new team mates in the coming year right?)
·         Last use and last user tracking
o   Benefits
§  At a glance we could see the last time a business project or specific version/type of business project was used.
§  Already designed for .net3.5 apps
§  We could easily spot users that are continuing to use old versions or whose automatic updates have stopped functioning
§  If audit tracking is also added, we could see frequency
·         Test Mode – useful? Worth the large development costs?
o   A test mode enables a tester or developer to click around in the application to their heart’s content without being afraid something will be permanently changed (mainframe saves, sql server updates/deletes, etc.)
o   Drawbacks: difficult to implement
o   Benefits: a tester, or developer can fire up the application, place it in test mode and get familiar with it by using it instead of trying to read all the code.

Can anyone think of any more? Thoughts on any of these?
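For the About section item in the list above, here is one possible way to gather the dependency details (name, assembly version, file version, modified date). It is only a sketch: `AboutInfo` is a hypothetical name, and it lists the assemblies currently loaded in the process, which is an assumption about what "all Dlls" should mean and may miss references that have not been loaded yet.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Reflection;

// Hypothetical helper for an About dialog.
public static class AboutInfo
{
    // Returns one line per loaded assembly that has a file on disk:
    // name, assembly version, file version, and last-modified date.
    public static List<string> DescribeLoadedAssemblies()
    {
        var lines = new List<string>();
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            string path;
            try
            {
                path = assembly.Location;
            }
            catch (NotSupportedException)
            {
                continue; // dynamic assemblies have no location
            }
            if (string.IsNullOrEmpty(path) || !File.Exists(path))
            {
                continue;
            }

            AssemblyName name = assembly.GetName();
            FileVersionInfo fileInfo = FileVersionInfo.GetVersionInfo(path);
            DateTime modified = File.GetLastWriteTime(path);

            lines.Add(string.Format(
                "{0} | assembly {1} | file {2} | modified {3:yyyy-MM-dd HH:mm}",
                name.Name, name.Version, fileInfo.FileVersion, modified));
        }
        return lines;
    }
}
```

The resulting lines could be shown in a list box on the Help->About form or written to the log at startup.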

Application architecture


Based on the 1st chapter of the book Dependency Injection in .NET, it appears a more usable dependency hierarchy for an application is one where the middle tier is the master, with no dependencies on the User-Interface or Data layers. This is accomplished via interfaces (or abstract classes). So the user-interface should have no dependencies besides the middle tier (business logic/Domain library), and the Data layer(s) should depend on the business logic tier.

The reason for this is that if the database itself changes form (we move from one sql server to another, or change the database design, or the data feed moves to ISeries instead of a mainframe, or to Teradata instead of a mainframe) the only thing that needs to change is the data layer. If we change the UI around (windows forms to web) the only thing that has to change should be the UI layer. The middle tier (business logic/Domain library) should be the absolute heart of the application.

Another major reason for this is if the business logic needs to be changed the middle tier having no other dependencies should make it safer to make that change without worry for what the other moving parts are going to do.

The goal being to move this spaghetti type code:

[Image: bad dependencies]

To this clean structured layout where the UI or data access sections can be swapped out as needed:

[Image: good dependencies]

The first chapter of the book is available online for free, and is where I read this information. The examples in it are fairly well written.
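To make the dependency direction described above concrete, here is a small sketch (not taken from the book). All type names are hypothetical. The middle tier owns the interface, the data layer implements it, and the middle tier never references the data layer or the UI.

```csharp
using System;
using System.Collections.Generic;

// ----- Middle tier (business logic / domain): no outside dependencies -----
public class Customer
{
    public string Name { get; set; }
    public decimal Balance { get; set; }
}

// The domain owns the abstraction; data layers implement it.
public interface ICustomerRepository
{
    IEnumerable<Customer> GetAll();
}

public class OverdueReport
{
    private readonly ICustomerRepository _repository;

    public OverdueReport(ICustomerRepository repository)
    {
        _repository = repository;
    }

    public List<string> NamesOwingMoreThan(decimal threshold)
    {
        var names = new List<string>();
        foreach (Customer customer in _repository.GetAll())
        {
            if (customer.Balance > threshold) names.Add(customer.Name);
        }
        return names;
    }
}

// ----- Data layer: depends on the middle tier, never the reverse -----
// An in-memory stand-in; a SQL Server or mainframe version would
// implement the same interface.
public class InMemoryCustomerRepository : ICustomerRepository
{
    public IEnumerable<Customer> GetAll()
    {
        return new List<Customer>
        {
            new Customer { Name = "Ann", Balance = 120m },
            new Customer { Name = "Bob", Balance = 15m }
        };
    }
}
```

The UI (or application startup code) is where the pieces get wired together, for example `new OverdueReport(new InMemoryCustomerRepository()).NamesOwingMoreThan(100m)`. Swapping the data source then means passing a different `ICustomerRepository`, with no change to `OverdueReport`.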

Coding standards ideas from the past

These are the updates/revisions I made for my team when asked again in June:


Non-production warnings:
                An application should produce a warning on the blog, and by a net send to itself that it is in testing or non-production mode.  Otherwise it could run and not produce any results without anyone noticing.

Major modes/flags:
                An automation application should report all major flag settings it is running under to the blog for that run. (non-production, holiday run, any other major setting that could be set incorrectly for a particular situation.)

Net Sends:
                Net sends should be used to report failures and crashes instead of successes, since we only need to do something if there is a failure or crash. Reporting successes tends to make us get used to ignoring net sends.  They should also be used VERY sparingly: some code written for a situation I never expected (all programs/automation net sending me EVERY time they could not use statsout for the blog) resulted in over 1200 net send messages being sent to me over 3 days.

Exceptions:
                Throwing exceptions is wasteful. Code should be designed in such a way that each part of the path from where the exception would be thrown to where it would be handled can be done without throwing an exception, as there is overhead there and it violates the definition of Structured Programming.
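One common way to follow this guideline in .NET is the "Try" pattern used by `int.TryParse`, where failure is reported through the return value instead of a thrown exception. A minimal sketch, with a hypothetical `QuantityParser` helper invented for this example:

```csharp
using System;

// Hypothetical helper: reports failure through the return value,
// following the same shape as int.TryParse.
public static class QuantityParser
{
    public static bool TryParseQuantity(string text, out int quantity)
    {
        // Valid only if the text is an integer and not negative.
        bool ok = int.TryParse(text, out quantity) && quantity >= 0;
        if (!ok)
        {
            quantity = 0;
        }
        return ok;
    }
}
```

The caller checks the boolean and decides what to do, so bad input never travels up the call stack as an exception.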

Return, exit for, exit statements:
                As a rule, having more than one return statement in the same function violates the definition of structured programming.
                Exit for, to a lesser extent, is more of the same violation.

Version Numbering (revised):
                Your build should include year month day in that order, but it should be in such a way that a straight string comparison puts the version numbers in the correct order. (you need a non-zero placeholder in front of month and day, or only month/day if they are in the same field) This ensures the ability to easily tell which is the newest version of a project.  Modification dates alone do not account for this because an old version could be run again, and produce a new modification date without actually being the latest changes/version.
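One possible reading of the placeholder rule, sketched below: keep the year in one field and put month and day together in the next field with a leading 1, so January 5th becomes 10105 rather than 105. The `major.minor.YYYY.1MMDD` layout and the `BuildVersion` name are assumptions made for this example, not the team's actual scheme.

```csharp
using System;

// Hypothetical helper for date-based version numbers.
public static class BuildVersion
{
    // Assumed layout: major.minor.YYYY.1MMDD
    // The leading 1 is the non-zero placeholder: as a number, 0105 would
    // collapse to 105, changing the field width and breaking string ordering.
    public static string For(int major, int minor, DateTime buildDate)
    {
        int monthDay = 10000 + buildDate.Month * 100 + buildDate.Day;
        return string.Format("{0}.{1}.{2}.{3}", major, minor, buildDate.Year, monthDay);
    }
}
```

With this layout a build from 2009-01-05 is `1.0.2009.10105` and one from 2009-10-30 is `1.0.2009.11030`, and an ordinal string comparison puts them in date order as long as the major and minor fields have the same width.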

Version Control Commits:
                After doing your revisions to a project, if they involve multiple new features, consider committing the individual files with the notes of what features have been added/revised.  This is easily obtained by right clicking the file and selecting show changes.

Change Log (revised):
                With version control, our change log should still include features added, modified, or revised so that end users can see the information. It should be less technical than what is in the version control, but features and functionality changed should still have change log information. Consider updating the change log, then copying that data and just using directly with your commit.

Pre/post build events (revised)
                Nothing should auto zip any longer. Auto-build should be used only on automations and only for the startup project of a solution.