March 8, 2023

Writing an Analytics Server using Vapor Part 8 - Prepopulating the Database

In part 7, I created a macOS app that can act as a testbed for the analytics server. It posts UserEvent objects to the userevent endpoint and it reads UserEvents from userevents, userevents/count and users/count. But the server is written to accept several queries to filter its results. The client should be able to send a range of requests with the appropriate queries. Eventually, I should be able to get nice summaries of the user activity in the app.

Of course, these queries won't be very meaningful without data to query. So I'm in a chicken-and-egg situation. I need to build a UI to post and display filtered queries of a relatively large set of data, but I won't have that large set of data until I've deployed the server and started receiving posts from my app.

So while I'm developing the app, I'm going to need to populate the server with fake data, and a fair bit of it.

I'll write another migration that creates a large number of UserEvents and adds them to the database. It will only run when a special argument is passed to Vapor at launch. Vapor will interpret that argument as a custom Environment. Of course, in my tests I'll need to use this custom Environment as well.

Updating my Test Cases

As I go to create a new test case for this new migration, I notice a pattern in the test cases I've written so far. Every one of them includes the following code to set up and tear down the server:

    private var sut: Application!
    
    override func setUp() {
        sut = Application(.testing)
        try! configure(sut)
    }
    
    override func tearDown() {
        sut.shutdown()
    }

This is repetitive and fragile.

I do a little digging around and find that Vapor provides an XCTestCase subclass that already more or less does what I've been trying to do here. XCTVaporTests holds an Application variable (that can be changed by a subclass), sets up the Application before each test and stops the server in its tearDown() method.

What it doesn't do is allow me to declare a different Environment in a standardized way. For writing the PopulateWithRandomUserEvents migration, I'm going to need to use a different Environment, while all my other tests should happen with the standard .testing Environment.

So I write a subclass of XCTVaporTests that sets up a .testing Environment by default but allows subclasses to override which Environment is used. While I'm at it, I write a little makeSUT() method so that I can write my tests in the way that I prefer.

class SimpleVaporTests: XCTVaporTests {
    
    public class var environment: Environment { .testing }

    override class func setUp() {
        XCTVapor.app = { Application(environment) }
    }
    
    func makeSUT() throws -> Application {
        try configure(app)
        return app
    }
}

I change all my test cases to derive from SimpleVaporTests and get rid of all that boilerplate code.

The Custom Environment

I don't want PopulateWithRandomUserEvents to run every time the server launches. I only want it to run when I need a large set of data to test against. Enter Vapor's Environment (docs). In my test case, I set a custom Environment that sets a special key-value pair in its arguments, passing in the same arguments as the .testing Environment uses.

    private static let env = Environment(name: PopulateWithRandomUserEvents.prepopulate, arguments: Environment.testing.arguments)
    override class var environment: Environment { env }

I add a static function in PopulateWithRandomUserEvents that checks the name of the Environment to decide whether the migration should be run:

    static func shouldPrepopulate(for app: Application) -> Bool {
        app.environment.name == PopulateWithRandomUserEvents.prepopulate
    }

and I check against this function in configure.swift:

    if PopulateWithRandomUserEvents.shouldPrepopulate(for: app) {
        app.migrations.add(PopulateWithRandomUserEvents())
    }
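The definition of PopulateWithRandomUserEvents.prepopulate isn't shown here. As a rough sketch of the same idea, the environment name could also live in an extension on Environment. The `prepopulate` property and the free function below are hypothetical names for illustration, and the string value is only inferred from the -e argument used when launching the server later in this post.

```swift
import Vapor

extension Environment {
    /// Hypothetical convenience, not from the original project.
    /// The name is assumed from the `-e prepopulate_with_random` launch argument.
    static var prepopulate: Environment { .custom(name: "prepopulate_with_random") }
}

/// Hypothetical free-function version of the check.
/// It compares names only, mirroring the check in shouldPrepopulate(for:).
func shouldPrepopulate(_ environment: Environment) -> Bool {
    environment.name == Environment.prepopulate.name
}
```

Comparing the names keeps the check independent of whatever arguments the Environment happens to carry, which matters here because the test Environment borrows its arguments from .testing.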

Finally, in PopulateWithRandomUserEvents.prepare(on database: FluentKit.Database), I log to the console using Vapor's Logger (docs):

    let logger = Logger(label: String(describing: PopulateWithRandomUserEvents.self))
    private func log(_ string: Logger.Message) {
        logger.info(string)
    }
    
    func prepare(on database: FluentKit.Database) async throws {

        log("Populating database with random events")
    }

Now I add a custom --env argument for the current scheme, passing in the custom Environment name from shouldPrepopulate(for:):

screen shot of scheme editor showing custom environment argument


I then hit command-R to see if the log appears in the console.

[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ INFO ] Populating database with random events
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ NOTICE ] Server starting on http://127.0.0.1:8080

I stop that process and go to the Terminal to try the same thing:

% vapor run serve -e prepopulate_with_random
Building for debugging...
Build complete! (0.37s)
[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ INFO ] Populating database with random events
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ NOTICE ] Server starting on http://127.0.0.1:8080

Taking Arguments from the Environment

It would be nice to be able to decide at runtime how many UserEvents are created and for how many users. If I create thousands of events for each test, then my testing is going to get bogged down. Yet I'm going to need thousands of events when I'm developing the client app.

Luckily, Environment has a static function called get() that reads an environment variable and returns it as an optional String.

Unfortunately, retrieving values as a String and then converting them to a usable value with a fallback default is ugly boilerplate code, even with this convenience. So I extend Environment with a property wrapper for reading values of different types from the environment:

protocol EnvironmentKey {
    init?(_ string: String)
}
extension EnvironmentKey {
    init?(_ string: String?) {
        guard let string else { return nil }
        self.init(string)
    }
}

extension Int: EnvironmentKey {}
extension Double: EnvironmentKey {}
extension String: EnvironmentKey {}
extension Bool: EnvironmentKey {}

// MARK: -

extension Environment {
    
    @propertyWrapper
    struct Key<Value: EnvironmentKey> {
        let key: String
        let defaultValue: Value
        
        init(_ key: String, _ defaultValue: Value) {
            self.key = key
            self.defaultValue = defaultValue
        }
        
        var wrappedValue: Value {
            get {
                return Value(Environment.get(key)) ?? defaultValue
            }
        }
    }
}
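To see how the fallback behaves, here is a small stand-alone sketch of the same pattern that runs without Vapor. It is not the code from the project: `EnvValue` and `Settings` are hypothetical names, the default values are made up, and it reads the variables through ProcessInfo and LosslessStringConvertible instead of Environment.get() and the EnvironmentKey protocol.

```swift
import Foundation

// Stand-in for the property wrapper above, using only Foundation.
@propertyWrapper
struct EnvValue<Value: LosslessStringConvertible> {
    let key: String
    let defaultValue: Value

    init(_ key: String, _ defaultValue: Value) {
        self.key = key
        self.defaultValue = defaultValue
    }

    var wrappedValue: Value {
        // Fall back to the default when the variable is missing or can't be parsed.
        guard let raw = ProcessInfo.processInfo.environment[key] else { return defaultValue }
        return Value(raw) ?? defaultValue
    }
}

// Hypothetical settings type with made-up defaults.
struct Settings {
    @EnvValue("count", 100) var count: Int
    @EnvValue("users", 3) var userCount: Int
    @EnvValue("timespan", 3600) var timespan: Double
}

setenv("count", "1500", 1)          // valid: overrides the default
setenv("users", "not a number", 1)  // malformed: the default is used
unsetenv("timespan")                // missing: the default is used

let settings = Settings()
print(settings.count, settings.userCount, settings.timespan)
```

One thing to keep in mind with this approach is that a malformed value silently falls back to the default. That also applies to Bool, which only parses the strings "true" and "false", so a variable set to 1 would be ignored.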

I then add properties to PopulateWithRandomUserEvents for the values I'd like to be able to get from the environment:

    @Environment.Key("count", .prepopulatedEventCount) var count: Int
    @Environment.Key("users", .prepopulatedUserCount) var userCount: Int
    @Environment.Key("timespan", .prepopulatedTimeSpan) var timespan: TimeInterval

and I add logging of these to the prepare(on:) method:

    func prepare(on database: FluentKit.Database) async throws {

        log("Populating database with random events")
        
        log("Populated database with \(count) events for \(userCount) users, starting at \(Date().addingTimeInterval(-timespan))")
    }

Sadly, this is also testing that can't be automated. It has to be done manually using Xcode's scheme editor.

screen shot of scheme editor showing custom environment argument and custom environment variables


This works when I run it in Xcode:

[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ INFO ] Populating database with random events
[ INFO ] Populated database with 20000 events for 3 users, starting at 2023-03-08 12:56:04 +0000
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ NOTICE ] Server starting on http://127.0.0.1:8080

and when I run it from the Terminal:

% export users=15
% export count=1500
% export timespan=3600
% vapor run serve -e prepopulate_with_random
Building for debugging...
[2/2] Emitting module App
Build complete! (1.59s)
[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.CreateUserEventRecordTable]
[ INFO ] [Migrator] Starting prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ INFO ] Populating database with random events
[ INFO ] Populated database with 1500 events for 15 users, starting at 2023-03-09 12:35:31 +0000
[ INFO ] [Migrator] Finished prepare [database-id: sqlite, migration: App.PopulateWithRandomUserEvents]
[ NOTICE ] Server starting on http://127.0.0.1:8080

Prepopulating the Database

Now, finally, I have things set up so that I can actually populate the database.

I add some test cases to make sure that the database is prepopulated appropriately. I do this iteratively, but I won't bore you with the step-by-step:

    func test_populates_database() throws {
        let sut = try makeSUT()
                
        try sut.test(.GET, UsersController.countPath) { response in
            let received = try JSONDecoder().decode(Int.self, from: response.body)
            XCTAssert(received > 0)
        }
    }

    func test_populates_database_with_expected_count() throws {
        let sut = try makeSUT()
                
        try sut.test(.GET, countPath()) { response in
            let received = try JSONDecoder().decode(Int.self, from: response.body)
            XCTAssertEqual(received, .prepopulatedEventCount)
        }
    }

    func test_populates_database_with_events_in_expected_timescale() throws {
        let sut = try makeSUT()
                
        let now = Date()
        let threeYearsAgo = now.addingTimeInterval(-.prepopulatedTimeSpan)
        let endOfDay = Calendar.current.startOfDay(for: now.addingTimeInterval(.oneDay))
        try sut.test(.GET, countPath(startDate: threeYearsAgo, endDate: endOfDay)) { response in
            let received = try JSONDecoder().decode(Int.self, from: response.body)
            XCTAssertEqual(received, .prepopulatedEventCount)
        }
    }
    
    
    func test_populates_database_with_expected_number_of_users() throws {
        let sut = try makeSUT()
                        
        try sut.test(.GET, UsersController.countPath) { response in
            let received = try JSONDecoder().decode(Int.self, from: response.body)
            
            // since event creation is random,
            // it's possible that an event won't be created for every user
            // but we can make sure that not too many users are created
            XCTAssert(received <= .prepopulatedUserCount)
        }
    }
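These tests, like the property declarations earlier, lean on constants such as .prepopulatedEventCount and .oneDay whose definitions aren't shown in this post. The leading-dot syntax works when they are static members of the type the compiler expects at that point. A sketch of how they might be declared follows. The values are assumptions chosen for illustration, not the project's real defaults.

```swift
import Foundation

// Assumed defaults for illustration only; the real values are not shown in the post.
extension Int {
    static let prepopulatedEventCount = 2_000
    static let prepopulatedUserCount = 3
}

extension TimeInterval {
    static let oneDay: TimeInterval = 24 * 60 * 60
    /// Roughly three years, ignoring leap days.
    static let prepopulatedTimeSpan: TimeInterval = 3 * 365 * oneDay
}

// The contextual type (Int or TimeInterval) lets the type name be omitted.
let expectedCount: Int = .prepopulatedEventCount
let start = Date().addingTimeInterval(-.prepopulatedTimeSpan)
```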

Now implementing the actual populating of the database is pretty straightforward:

    func prepare(on database: FluentKit.Database) async throws {

        log("Populating database with random events")

        let users = (0..<userCount).map { _ in UUID() }
        
        for _ in 0..<count {
            let event = UserEvent.random(for: users.randomElement()!, in: -timespan ... 0)
            let record = UserEventRecord(event)
            try await record.create(on: database)
        }

        log("Populated database with \(count) events for \(userCount) users, starting at \(Date().addingTimeInterval(-timespan))")
    }
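The UserEvent.random(for:in:) helper isn't shown in this post. A minimal sketch of what such a helper could look like is below, using a hypothetical stand-in type: `SketchEvent`, its fields and the extra `now` parameter are all assumptions, not the real UserEvent.

```swift
import Foundation

// Hypothetical stand-in for UserEvent; the real type's fields are not shown in this post.
struct SketchEvent {
    enum Action: CaseIterable { case start, pause, stop }

    let userID: UUID
    let action: Action
    let timestamp: Date

    /// Builds an event for `userID` with a random action and a timestamp
    /// at a random offset (in seconds) from `now` within `offsets`.
    static func random(for userID: UUID,
                       in offsets: ClosedRange<TimeInterval>,
                       now: Date = Date()) -> SketchEvent {
        SketchEvent(
            userID: userID,
            action: Action.allCases.randomElement()!,
            timestamp: now.addingTimeInterval(TimeInterval.random(in: offsets))
        )
    }
}

// Same shape as the loop in prepare(on:), without the database.
let users = (0..<3).map { _ in UUID() }
let now = Date()
let events = (0..<100).map { _ in
    SketchEvent.random(for: users.randomElement()!, in: -3600 ... 0, now: now)
}
```

Passing `now` in explicitly keeps every generated timestamp relative to the same instant, which makes the range easy to check in a test.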

Finally, I run the server and leave it running. I then run the client project and select the Events tab and I see what I would expect.

screen shot of client app showing 2000 events and 3 users

What's Next

Now that I can prepopulate the database, it's finally time to start requesting and displaying the data in the client app. But that's for next time.
