diff --git a/MOCK_DOCUMENTS_FOR_DEVELOPMENT.md b/MOCK_DOCUMENTS_FOR_DEVELOPMENT.md new file mode 100644 index 0000000..50dddff --- /dev/null +++ b/MOCK_DOCUMENTS_FOR_DEVELOPMENT.md @@ -0,0 +1,224 @@ +# Mock Documents for Colore Development + +Development feature that returns realistic mock documents when a document doesn't exist, eliminating the need to copy production data. + +## What It Does + +``` +Request for non-existent document → Returns realistic MOCK document +Request for created document → Returns real document from storage +``` + +Perfect for development: work with external documents without production data. + +--- + +## Quick Start + +### Get mock document (non-existent) +```bash +curl http://localhost:9240/document/app-1/doc-xyz +# Returns: Mock Document - doc-xyz +``` + +### Create real document +```bash +curl -X PUT -F "file=@test.txt" \ + http://localhost:9240/document/dev-app/my-doc/test.txt +``` + +### Get real document +```bash +curl http://localhost:9240/document/dev-app/my-doc +# Returns: actual document (not mock) +``` + +### Delete +```bash +curl -X DELETE http://localhost:9240/document/dev-app/my-doc +``` + +--- + +## Enable/Disable + +### Development (Enable mocks) +```bash +# Edit: docker/colore/variables.env +MOCK_DOCUMENTS_ENABLED=true +RACK_ENV=development + +# Optional: Set random authors (usernames from people, comma-separated) +MOCK_DOCUMENT_AUTHORS=f.rossi,a.rodriguez,p.doe,m.smith + +# Rebuild +docker-compose up --build -d colore +``` + +If `MOCK_DOCUMENT_AUTHORS` is not set, the author defaults to `Mock System`. + +### Production (Mocks auto-disabled for safety) +```bash +RACK_ENV=production +# Mocks automatically disabled - no action needed +``` + +--- + +## Key Features + +### Random Authors (from people) +This avoids people-search errors when an application in development uses the author of a mock document.
Mock documents include a **randomly selected author** from a configurable list of usernames from the people microservice: + +```bash +# Configuration (comma-separated usernames) +MOCK_DOCUMENT_AUTHORS=f.rossi,a.rodriguez,p.doe,m.smith +``` + +Each mock document will have a random author from this list in its metadata. + +### Supported File Types +- `.txt` - Text +- `.pdf` - PDF structure +- `.html` - HTML +- `.json` - JSON +- `.docx` - Word document +- Others - Text fallback + +### What Works + +| Operation | Mock | Real | +|-----------|------|------| +| GET document | ✅ | ✅ | +| GET file | ✅ | ✅ | +| POST title | ❌ | ✅ | +| POST version | ❌ | ✅ | +| DELETE | ⏭️ Ignored | ✅ | + +Mock documents are read-only by design. + +--- + +## How It's Implemented + +### Files Created +- `lib/mock_document.rb` - Generates realistic mocks (reads the author list from `MOCK_DOCUMENT_AUTHORS`) +- Tests - Unit and integration tests + +### Files Modified +- `lib/config.rb` - Environment detection + production safety +- `lib/document.rb` - Returns mock if enabled +- `lib/app.rb` - Endpoint protections +- `config/app.yml` - Configuration (`mock_documents_enabled`) +- `docker/colore/variables.env` - Set MOCK_DOCUMENTS_ENABLED + MOCK_DOCUMENT_AUTHORS + +### Core Logic +``` +# When loading a document: +1. Check if exists on disk → Return real document +2. If not exists + MOCK_DOCUMENTS_ENABLED=true → Return mock +3. 
If not exists + not enabled → Return 404 +``` + +--- + +## Production Safety + +✅ **Automatic protection** - mocks cannot be used in production + +``` +Environment Detection (in config.rb) + ↓ +If RACK_ENV='production' → Force MOCK_DOCUMENTS_ENABLED=false +If RACK_ENV='development' → Use configured value +``` + +**Result:** +- Even if someone sets `MOCK_DOCUMENTS_ENABLED=true` in production by mistake +- Mocks are automatically disabled +- Application works normally (returns 404 for non-existent docs) +- No failures, no errors + +--- + +## Ruby Integration Example + +```ruby +class DocumentService + def fetch_document(app_id, doc_id) + response = HTTP.get("http://colore:9240/document/#{app_id}/#{doc_id}") + JSON.parse(response.body) + # Works with both mocks and real documents automatically + end + + def create_document(app_id, doc_id, filename, file_content) + HTTP.put( + "http://colore:9240/document/#{app_id}/#{doc_id}/#{filename}", + form: { file: file_content } + ) + end +end +``` + +--- + +## Current Status + +✅ Running in development with mocks enabled +```bash +RACK_ENV=development +MOCK_DOCUMENTS_ENABLED=true +``` + +✅ Tested and verified +- Mock documents return realistic structures +- Real documents work normally +- Hybrid flow seamless +- Production safety active + +--- + +## Common Issues + +**Getting 404 instead of mock?** +→ Check `MOCK_DOCUMENTS_ENABLED=true` in `docker/colore/variables.env` + +**Can't update/delete mock?** +→ Intentional - mocks are read-only. Create real document instead. 
+ +**Want to verify?** +```bash +curl http://localhost:9240/document/test-app/any-id | jq .title +# If contains "Mock Document" → returns mock +``` + +--- + +## Testing + +```bash +# All tests +rspec spec/lib/mock_document_spec.rb +rspec spec/integration/mock_document_spec.rb +``` + +--- + +## Summary + +- ✅ Mock documents enabled in development +- ✅ **Random authors** from configurable list +- ✅ No need to copy production database +- ✅ Hybrid flow: mocks + real documents coexist +- ✅ Automatic production safety +- ✅ Works transparently with client code +- ✅ Fully tested and working + +### Example Authors Configuration +```bash +# docker/colore/variables.env +# Usernames from people microservice (comma-separated) +MOCK_DOCUMENT_AUTHORS=f.rossi,a.rodriguez,p.doe,m.smith,j.williams +``` + +Each mock document will randomly assign one of these usernames as metadata. diff --git a/config/app.yml b/config/app.yml index 8ad03fc..31bcd32 100644 --- a/config/app.yml +++ b/config/app.yml @@ -33,3 +33,6 @@ wkhtmltopdf_path: <%= ENV['WKHTMLTOPDF_PATH'] %> # Other settings tika_config_directory: <%= ENV['TIKA_CONFIG_DIRECTORY'] %> wkhtmltopdf_params: '-d 100 --encoding UTF-8' + +# Development settings - Enable mock documents when document not found +mock_documents_enabled: <%= ENV.fetch('MOCK_DOCUMENTS_ENABLED', 'false') == 'true' %> diff --git a/docker-compose.yml b/docker-compose.yml index a033020..f336403 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -14,6 +14,8 @@ services: - ./docker/colore/variables.env environment: RACK_ENV: development + extra_hosts: + - "host.docker.internal:host-gateway" networks: - colore ports: @@ -44,6 +46,8 @@ services: - ./docker/colore/variables.env environment: RACK_ENV: development + extra_hosts: + - "host.docker.internal:host-gateway" networks: - colore restart: on-failure diff --git a/docker/colore/Dockerfile b/docker/colore/Dockerfile index 058c864..5b97716 100644 --- a/docker/colore/Dockerfile +++ b/docker/colore/Dockerfile @@ 
-8,13 +8,17 @@ RUN apt-get update && apt-get -yq install --no-install-suggests --no-install-rec # Needed to get the latest libreoffice # Ref: https://wiki.debian.org/LibreOffice#Using_Debian_backports -RUN echo 'deb https://deb.debian.org/debian bullseye-backports main contrib non-free' >> /etc/apt/sources.list +# Note: Bullseye backports moved to archive.debian.org +RUN echo 'deb [trusted=yes] https://archive.debian.org/debian bullseye-backports main contrib non-free' >> /etc/apt/sources.list # Needed for Tesseract 5 # Ref: https://notesalexp.org/tesseract-ocr/html/ RUN echo 'deb https://notesalexp.org/tesseract-ocr5/bullseye bullseye main' >> /etc/apt/sources.list RUN wget -qO /etc/apt/trusted.gpg.d/alexp_key.asc https://notesalexp.org/debian/alexp_key.asc +# Archived repositories serve expired Release files, so skip the Valid-Until check +RUN echo 'Acquire::Check-Valid-Until "false";' > /etc/apt/apt.conf.d/90ignore-release-date + RUN apt-get update && apt-get -yq -t bullseye-backports install \ libreoffice \ tesseract-ocr \ @@ -22,7 +26,7 @@ RUN apt-get update && apt-get -yq -t bullseye-backports install \ tesseract-ocr-fra \ tesseract-ocr-spa -ARG TIKA_VERSION=3.2.2 +ARG TIKA_VERSION=3.2.3 RUN wget --quiet https://dlcdn.apache.org/tika/KEYS -O tika-keys && \ wget --quiet https://dlcdn.apache.org/tika/${TIKA_VERSION}/tika-app-${TIKA_VERSION}.jar.asc -O tika-app.jar.asc && \ diff --git a/docker/colore/variables.env.example b/docker/colore/variables.env.example index 39689e1..ee4f15f 100644 --- a/docker/colore/variables.env.example +++ b/docker/colore/variables.env.example @@ -2,3 +2,13 @@ LANG=en_US.UTF-8 LANGUAGE=en_US.UTF-8 LC_ALL=C.UTF-8 REDIS_URL=redis://redis:6379/4 + +# Development settings - Enable mock documents when document not found +# Set to 'true' to enable mock documents for development without copying production data +MOCK_DOCUMENTS_ENABLED=false + +# List of authors for mock documents (comma-separated usernames from people microservice) +# A random author from this list 
will be assigned to each mock document +# Example: 'f.rossi,a.rodriguez,p.doe' +# Default: 'Mock System' +MOCK_DOCUMENT_AUTHORS=f.rossi,a.rodriguez,p.doe diff --git a/lib/app.rb b/lib/app.rb index 82328ac..25dd853 100644 --- a/lib/app.rb +++ b/lib/app.rb @@ -9,10 +9,7 @@ require_relative 'colore' module Colore - # TODO: validate path-like parameters for invalid characters - # - # This is the Sinatra API implementation for Colore. - # See (BASE/config.ru) for rackup details + # Sinatra API implementation for Colore (see config.ru). TODO: validate path-like parameters class App < Sinatra::Base set :backtrace, true before do @@ -22,29 +19,16 @@ class App < Sinatra::Base @errlog = Logger.new(C_.error_log || STDERR) end - # - # Landing page. A vague intention exists to put the API docco here. - # + # Landing page with API documentation get '/' do haml :index end - # - # A custom 404 page - # - not_found do - JSON.dump(error: 'not found', status: 404, description: 'Document not found') - end + # Custom 404 page + not_found { JSON.dump(error: 'not found', status: 404, description: 'Document not found') } - # # Create document (will fail if document already exists) - # - # POST params: - # - title - # - actions - # - callback_url - # - file - # - author + # POST params: title, actions, callback_url, file, author put '/document/:app/:doc_id/:filename' do |app, doc_id, filename| doc_key = DocKey.new app, doc_id doc = Document.create @storage_dir, doc_key # will raise if doc exists @@ -54,17 +38,8 @@ class App < Sinatra::Base respond_with_error e end - # - # Stores a new version of a given document. Side-effects - the - # current version is advanced and conversions are performed on the - # document if actions are specified (and callbacks will be sent if - # a callback_url is specified). - # - # POST params: - # - actions - # - callback_url - # - file - # - author + # Stores a new version. 
Side-effects: advances current version, performs conversions if actions specified + # POST params: actions, callback_url, file, author post '/document/:app/:doc_id/:filename' do |app, doc_id, filename| doc_key = DocKey.new app, doc_id doc = Document.load(@storage_dir, doc_key) @@ -84,11 +59,7 @@ class App < Sinatra::Base params[:callback_url] ) end - respond 201, "Document stored", { - app: app, - doc_id: doc_id, - path: doc.file_path(Colore::Document::CURRENT, filename), - } + respond 201, "Document stored", { app: app, doc_id: doc_id, path: doc.file_path(Colore::Document::CURRENT, filename) } rescue StandardError => e respond_with_error e end @@ -97,6 +68,8 @@ class App < Sinatra::Base post '/document/:app/:doc_id/title/:title' do |app, doc_id, title| doc_key = DocKey.new app, doc_id doc = Document.load(@storage_dir, doc_key) + return reject_mock_operation if mock_document?(doc) + doc.title = title doc.save_metadata respond 200, 'Title updated' @@ -104,16 +77,12 @@ class App < Sinatra::Base respond_with_error e end - # - # Request new conversion - # - # POST params: - # - callback_url + # Request new conversion (POST params: callback_url) post '/document/:app/:doc_id/:version/:filename/:action' do |app, doc_id, version, filename, action| doc_key = DocKey.new app, doc_id - raise DocumentNotFound.new unless Document.exists? @storage_dir, doc_key - doc = Document.load @storage_dir, doc_key + return respond 202, "Mock conversion accepted" if mock_document?(doc) + raise VersionNotFound.new unless doc.has_version? 
version Sidekiq::ConversionWorker.perform_async doc_key, version, filename, action, params[:callback_url] @@ -122,10 +91,7 @@ class App < Sinatra::Base respond_with_error e end - # # Delete document - # - # DELETE params: delete '/document/:app/:doc_id' do |app, doc_id| Document.delete @storage_dir, DocKey.new(app, doc_id) respond 200, 'Document deleted' @@ -133,10 +99,7 @@ class App < Sinatra::Base respond_with_error e end - # # Delete document version - # - # DELETE params: delete '/document/:app/:doc_id/:version' do |app, doc_id, version| doc = Document.load @storage_dir, DocKey.new(app, doc_id) doc.delete_version version @@ -146,9 +109,7 @@ class App < Sinatra::Base respond_with_error e end - # - # Get file. Disabled in production. - # + # Get file (disabled in production) get '/document/:app/:doc_id/:version/:filename' do |app, doc_id, version, filename| doc = Document.load @storage_dir, DocKey.new(app, doc_id) ctype, file = doc.get_file(version, filename) @@ -158,9 +119,7 @@ class App < Sinatra::Base respond_with_error e end unless environment == :production - # # Get document info - # get '/document/:app/:doc_id' do |app, doc_id| doc = Document.load @storage_dir, DocKey.new(app, doc_id) respond 200, 'Information retrieved', doc.to_hash @@ -168,13 +127,7 @@ class App < Sinatra::Base respond_with_error e end - # - # Convert document - # - # POST params: - # file - the file to convert - # action - the conversion to perform - # language - the language of the file (defaults to 'en') + # Convert document (POST params: file, action, language) post '/convert' do unless params[:file] return respond 400, "missing file parameter" @@ -192,14 +145,7 @@ class App < Sinatra::Base respond_with_error e end - # Legacy method to convert files - # Brought over from Heathen - # - # POST params: - # file - the file to convert - # url - a URL to convert - # action - the conversion to perform - # + # Legacy method to convert files (POST params: file, url, action) post 
"/#{LegacyConverter::LEGACY}/convert" do body = if params[:file] params[:file][:tempfile].read @@ -211,21 +157,12 @@ class App < Sinatra::Base path = LegacyConverter.new.convert_file params[:action], body, params[:language] converted_url = @legacy_url_base + path content_type 'application/json' - { - original: '', - converted: converted_url, - }.to_json + { original: '', converted: converted_url }.to_json rescue StandardError => e legacy_error e, e.message end - # Legacy method to retrieve converted file - # May only be needed for development if Nginx is used to get the file directly - # - # POST params: - # file - the file to convert - # url - a URL to convert - # action - the conversion to perform + # Legacy method to retrieve converted file (for development if Nginx not used) get "/#{LegacyConverter::LEGACY}/:file_id" do |file_id| content = LegacyConverter.new.get_file file_id content_type content.mime_type @@ -234,6 +171,7 @@ class App < Sinatra::Base legacy_error 400, e.message end + # rubocop:disable Metrics/BlockLength helpers do # Renders all responses (including errors) in a standard JSON format. 
def respond(status, message, extra = {}) @@ -281,6 +219,17 @@ def legacy_error(status, message, extra = {}) { error: message }.merge(extra).to_json, ] end + + # Check if document is a mock + def mock_document?(doc) + doc.is_a?(Colore::MockDocument) + end + + # Standard response for mock document operations + def reject_mock_operation + respond 400, 'Operation not supported on mock documents' + end end + # rubocop:enable Metrics/BlockLength end end diff --git a/lib/colore.rb b/lib/colore.rb index ae1b295..bf1812a 100644 --- a/lib/colore.rb +++ b/lib/colore.rb @@ -7,6 +7,7 @@ require_relative 'converter' require_relative 'legacy_converter' require_relative 'document' +require_relative 'mock_document' require_relative 'heathen' require_relative 'sidekiq_workers' require_relative 'tika_config' diff --git a/lib/config.rb b/lib/config.rb index e94d90b..3cc626a 100644 --- a/lib/config.rb +++ b/lib/config.rb @@ -45,6 +45,10 @@ class C_ attr_accessor :tika_config_directory # @return [String] Params for wkhtmltopdf attr_accessor :wkhtmltopdf_params + # @return [Boolean] Enable mock documents for development when document not found + attr_accessor :mock_documents_enabled + # @return [String] Current environment (development, test, production, staging) + attr_accessor :environment def self.config_file_path Pathname.new File.expand_path('../config/app.yml', __dir__) @@ -52,30 +56,17 @@ def self.config_file_path def self.config @config ||= begin - template = ERB.new(config_file_path.read) - yaml = YAML.load(template.result) c = new - c.storage_directory = yaml['storage_directory'] - c.legacy_url_base = yaml['legacy_url_base'] - c.legacy_purge_days = yaml['legacy_purge_days'].to_i - c.redis = yaml['redis'] - c.conversion_log = yaml['conversion_log'] - c.error_log = yaml['error_log'] - - c.convert_path = yaml['convert_path'] || 'convert' - c.libreoffice_path = yaml['libreoffice_path'] || 'libreoffice' - c.tesseract_path = yaml['tesseract_path'] || 'tesseract' - c.tika_path = 
yaml['tika_path'] || 'tika' - c.wkhtmltopdf_path = yaml['wkhtmltopdf_path'] || 'wkhtmltopdf' - - c.tika_config_directory = yaml['tika_config_directory'] || '../tmp/tika' - c.wkhtmltopdf_params = yaml['wkhtmltopdf_params'] || '' - + yaml = load_yaml_config + load_core_config(c, yaml) + load_tool_paths(c, yaml) + load_additional_config(c, yaml) + load_environment_config(c, yaml) c end end - def self.method_missing *args + def self.method_missing(*args) if config.respond_to? args[0].to_sym config.send(*args) else @@ -87,5 +78,44 @@ def self.method_missing *args def self.reset @config = nil end + + # Private helper methods + class << self + private + + def load_yaml_config + template = ERB.new(config_file_path.read) + YAML.load(template.result) + end + + def load_core_config(config, yaml) + config.storage_directory = yaml['storage_directory'] + config.legacy_url_base = yaml['legacy_url_base'] + config.legacy_purge_days = yaml['legacy_purge_days'].to_i + config.redis = yaml['redis'] + config.conversion_log = yaml['conversion_log'] + config.error_log = yaml['error_log'] + end + + def load_tool_paths(config, yaml) + config.convert_path = yaml['convert_path'] || 'convert' + config.libreoffice_path = yaml['libreoffice_path'] || 'libreoffice' + config.tesseract_path = yaml['tesseract_path'] || 'tesseract' + config.tika_path = yaml['tika_path'] || 'tika' + config.wkhtmltopdf_path = yaml['wkhtmltopdf_path'] || 'wkhtmltopdf' + end + + def load_additional_config(config, yaml) + config.tika_config_directory = yaml['tika_config_directory'] || '../tmp/tika' + config.wkhtmltopdf_params = yaml['wkhtmltopdf_params'] || '' + end + + def load_environment_config(config, yaml) + config.environment = ENV['RACK_ENV'] || 'development' + mock_enabled = yaml['mock_documents_enabled'] || false + # In production, always disable mocks regardless of config + config.mock_documents_enabled = config.environment == 'production' ? 
false : mock_enabled + end + end end end diff --git a/lib/document.rb b/lib/document.rb index c814eec..f53327b 100644 --- a/lib/document.rb +++ b/lib/document.rb @@ -56,13 +56,18 @@ def self.create(base_dir, doc_key) end # Loads the document information. Raises [DocumentNotFound] if the document does not exist. + # In development with mock_documents_enabled, returns a MockDocument instead of raising an error. + # Note: Production environment automatically disables mocks in config.rb # @param base_dir [String] The base path to the storage area # @param doc_key [DocKey] The document identifier - # @return [Document] + # @return [Document or MockDocument] def self.load(base_dir, doc_key) - raise DocumentNotFound.new unless exists? base_dir, doc_key + return new(base_dir, doc_key) if exists? base_dir, doc_key + + # Return mock document if enabled and document doesn't exist + return Colore::MockDocument.new(doc_key) if Colore::C_.mock_documents_enabled - doc = new base_dir, doc_key + raise DocumentNotFound.new end # Deletes the document directory (and all contents) if it exists. diff --git a/lib/mock_document.rb b/lib/mock_document.rb new file mode 100644 index 0000000..e113c0c --- /dev/null +++ b/lib/mock_document.rb @@ -0,0 +1,174 @@ +# frozen_string_literal: true + +require 'stringio' +require 'filemagic/ext' + +module Colore + # Generates mock documents for development purposes. + # When a document is not found and MOCK_DOCUMENTS_ENABLED=true, + # this class creates realistic mock documents that allow development + # without copying production data. 
+ class MockDocument + attr_reader :doc_key + + # Constructor + # @param doc_key [DocKey] The document identifier + def initialize(doc_key) + @doc_key = doc_key + end + + # Returns a random author from the configured list of mock document authors + # Reads directly from ENV for runtime flexibility (no rebuild needed to change authors) + # @return [String] + def random_author + authors_string = ENV['MOCK_DOCUMENT_AUTHORS'] || 'Mock System' + authors = authors_string.split(',').map(&:strip) + authors.sample + end + + # Returns the title of the mock document + # @return [String] + def title + "Mock Document - #{@doc_key.doc_id}" + end + + # Returns an array of mock version identifiers + # @return [Array] + def versions + ['v001'] + end + + # Returns the current version identifier + # @return [String] + def current_version + 'v001' + end + + # Checks if the document has the specified version + # @param version [String] the version identifier + # @return [Bool] + # rubocop:disable Naming/PredicatePrefix + def has_version?(version) + versions.include?(version) || version == 'current' + end + # rubocop:enable Naming/PredicatePrefix + + # Retrieves a mock file from the document + # @param _version [String] the version identifier (unused, for interface compatibility) + # @param filename [String] the name of the file + # @return [Array] [mime_type, file_body] + def get_file(_version, filename) + # Generate mock file content based on filename extension + content = generate_mock_file_content(filename) + mime_type = content.mime_type + + [mime_type, content] + end + + # Returns the URL query path for the given file + # @param version [String] the version identifier + # @param filename [String] the name of the file + # @return [String] + def file_path(version, filename) + "/document/#{@doc_key.app}/#{@doc_key.doc_id}/#{version}/#{filename}" + end + + # Summarises the mock document as a Hash + # @return [Hash] + def to_hash + author = random_author + { + app: @doc_key.app, + 
doc_id: @doc_key.doc_id, + title: title, + current_version: current_version, + versions: { + v001: { + txt: { + content_type: 'text/plain', + filename: 'document.txt', + path: file_path('v001', 'document.txt'), + size: 150, + author: author, + created_at: Time.now, + }, + pdf: { + content_type: 'application/pdf', + filename: 'document.pdf', + path: file_path('v001', 'document.pdf'), + size: 5000, + author: author, + created_at: Time.now, + }, + }, + }, + } + end + + private + + # Generates mock file content based on file extension + # @param filename [String] the filename + # @return [String] the mock file content + def generate_mock_file_content(filename) + extension = File.extname(filename).downcase.delete('.') + + case extension + when 'txt' then generate_txt_content + when 'pdf' then generate_pdf_content + when 'docx' then generate_docx_content + when 'html' then generate_html_content + when 'json' then generate_json_content + else generate_default_content(filename) + end + end + + def generate_txt_content + "This is a mock document.\n\nDocument ID: #{@doc_key.doc_id}\nApplication: #{@doc_key.app}\n\n" \ + "This file is automatically generated for development purposes.\n" \ + "It allows you to test document retrieval without copying production data.\n" + end + + def generate_pdf_content + "%PDF-1.4\n" \ + "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n" \ + "2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n" \ + "3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>\nendobj\n" \ + "4 0 obj\n<< /Length 44 >>\nstream\nBT /F1 12 Tf 100 700 Td (Mock PDF Document) Tj ET\nendstream\nendobj\n" \ + "5 0 obj\n<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\nendobj\n" \ + "xref\n0 6\n0000000000 65535 f\n0000000009 00000 n\n0000000058 00000 n\n0000000115 00000 n\n0000000240 00000 n\n0000000336 00000 n\n" \ + "trailer\n<< /Size 6 /Root 1 0 R >>\n" \ + "startxref\n433\n%%EOF" + end + + 
 def generate_docx_content + require 'zlib' + "PK\x03\x04\x14\x00\x00\x00\b\x00Mock DOCX Document for development purposes" + end + + def generate_html_content + "<html><head><title>Mock Document</title></head><body>" \ + "<h1>Mock Document</h1>" \ + "<p>Document ID: #{@doc_key.doc_id}</p>" \ + "<p>Application: #{@doc_key.app}</p>" \ + "<p>This file is automatically generated for development purposes.</p>" \ + "</body></html>" + end + + def generate_json_content + { + doc_id: @doc_key.doc_id, + app: @doc_key.app, + title: title, + type: 'mock', + message: 'This is a mock document generated for development purposes', + }.to_json + end + + def generate_default_content(filename) + "Mock document file: #{filename}\n" \ + "Document ID: #{@doc_key.doc_id}\n" \ + "Application: #{@doc_key.app}\n" + end + end +end diff --git a/spec/integration/mock_document_spec.rb b/spec/integration/mock_document_spec.rb new file mode 100644 index 0000000..8efec6d --- /dev/null +++ b/spec/integration/mock_document_spec.rb @@ -0,0 +1,106 @@ +# frozen_string_literal: true + +require 'spec_helper' +require 'app' + +RSpec.describe Colore::App do + include Rack::Test::Methods + + def app + described_class + end + + let(:mock_storage_dir) { Pathname.new('spec/fixtures/app') } + let(:original_mock_setting) { Colore::C_.mock_documents_enabled } + let(:test_doc_key) { Colore::DocKey.new 'test-app', 'mock-123' } + + before do + # Enable mock documents for testing + original_mock_setting # Initialize the let variable + Colore::C_.mock_documents_enabled = true + + # Ensure the test document doesn't exist + Colore::Document.delete mock_storage_dir, test_doc_key + end + + after do + # Restore original setting + Colore::C_.mock_documents_enabled = original_mock_setting + end + + describe 'GET /document/:app/:doc_id' do + it 'returns mock document info when document does not exist' do + get '/document/test-app/mock-123' + + expect(last_response.status).to eq 200 + data = JSON.parse last_response.body + expect(data['app']).to eq 'test-app' + expect(data['doc_id']).to eq 'mock-123' + expect(data['title']).to include 'Mock Document' + end + + it 'returns actual document when it exists' do + # Use an existing document from fixtures + get '/document/a3/12346' + + expect(last_response.status).to eq 200 + data = JSON.parse last_response.body + expect(data['app']).to eq 'a3' + expect(data['doc_id']).to eq '12346' + end + end + 
+ describe 'GET /document/:app/:doc_id/:version/:filename' do + context 'when mock documents are enabled' do + it 'returns mock file content for non-existent document' do + get '/document/test-app/mock-456/v001/document.txt' + + expect(last_response.status).to eq 200 + expect(last_response.content_type).to include 'text/plain' + expect(last_response.body).to include 'This is a mock document' + end + + it 'returns PDF content for PDF requests' do + get '/document/test-app/mock-456/v001/document.pdf' + + expect(last_response.status).to eq 200 + expect(last_response.content_type).to include 'application/pdf' + expect(last_response.body).to include '%PDF' + end + end + end + + describe 'POST /document/:app/:doc_id/title/:title' do + it 'returns 400 error when trying to update mock document title' do + post '/document/test-app/mock-789/title/New%20Title' + + expect(last_response.status).to eq 400 + data = JSON.parse last_response.body + expect(data['description']).to include 'mock document' + end + end + + describe 'POST /document/:app/:doc_id/:version/:filename/:action' do + it 'returns 202 without processing conversion for mock document' do + post '/document/test-app/mock-789/v001/document.txt/htmltotext' + + expect(last_response.status).to eq 202 + data = JSON.parse last_response.body + expect(data['description']).to include 'Mock conversion accepted' + end + end + + describe 'when mock documents are disabled' do + before do + Colore::C_.mock_documents_enabled = false + end + + it 'raises DocumentNotFound when document does not exist' do + get '/document/test-app/not-found-789' + + expect(last_response.status).to eq 404 + data = JSON.parse last_response.body + expect(data['error']).to include 'not found' + end + end +end diff --git a/spec/lib/mock_document_spec.rb b/spec/lib/mock_document_spec.rb new file mode 100644 index 0000000..b47ad17 --- /dev/null +++ b/spec/lib/mock_document_spec.rb @@ -0,0 +1,117 @@ +# frozen_string_literal: true + +require 'spec_helper' + 
+RSpec.describe Colore::MockDocument do + let(:doc_key) { Colore::DocKey.new 'test-app', 'doc-123' } + let(:mock_doc) { described_class.new doc_key } + + describe '#title' do + it 'returns a mock title with document ID' do + expect(mock_doc.title).to eq 'Mock Document - doc-123' + end + end + + describe '#versions' do + it 'returns array with v001' do + expect(mock_doc.versions).to eq ['v001'] + end + end + + describe '#current_version' do + it 'returns v001 as current version' do + expect(mock_doc.current_version).to eq 'v001' + end + end + + describe '#has_version?' do + it 'returns true for v001' do + expect(mock_doc.has_version?('v001')).to be true + end + + it 'returns true for current' do + expect(mock_doc.has_version?('current')).to be true + end + + it 'returns false for other versions' do + expect(mock_doc.has_version?('v002')).to be false + end + end + + describe '#get_file' do + context 'when requesting a text file' do + it 'returns text/plain content' do + ctype, content = mock_doc.get_file('v001', 'document.txt') + expect(ctype).to include 'text/plain' + expect(content).to include 'This is a mock document' + end + end + + context 'when requesting a PDF file' do + it 'returns application/pdf content' do + ctype, content = mock_doc.get_file('v001', 'document.pdf') + expect(ctype).to include 'application/pdf' + expect(content).to include '%PDF-1.4' + end + end + + context 'when requesting an HTML file' do + it 'returns text/html content' do + ctype, content = mock_doc.get_file('v001', 'document.html') + expect(ctype).to include 'text/html' + expect(content).to include '<h1>Mock Document</h1>' + end + end + + context 'when requesting a JSON file' do + it 'returns application/json content' do + ctype, content = mock_doc.get_file('v001', 'document.json') + expect(ctype).to include 'application/json' + expect(content).to include 'doc_id' + end + end + + context 'when requesting an unknown file type' do + it 'returns text/plain as default' do + ctype, content = mock_doc.get_file('v001', 'document.xyz') + expect(ctype).to include 'text/plain' + expect(content).to include 'Mock document file: document.xyz' + end + end + end + + describe '#file_path' do + it 'returns correct URL path' do + path = mock_doc.file_path('v001', 'document.txt') + expect(path).to eq '/document/test-app/doc-123/v001/document.txt' + end + end + + describe '#to_hash' do + let(:hash) { mock_doc.to_hash } + + it 'returns hash with document metadata' do + expect(hash).to be_a Hash + expect(hash[:app]).to eq 'test-app' + expect(hash[:doc_id]).to eq 'doc-123' + expect(hash[:title]).to eq 'Mock Document - doc-123' + end + + it 'includes version information' do + expect(hash[:versions]).to have_key :v001 + end + + it 'includes file entries' do + v001 = hash[:versions][:v001] + expect(v001).to include :txt + expect(v001).to include :pdf + end + + it 'includes file metadata' do + txt_file = hash[:versions][:v001][:txt] + expect(txt_file[:content_type]).to include 'text/plain' + expect(txt_file[:filename]).to eq 'document.txt' + expect(txt_file[:author]).not_to be_nil + end + end +end diff --git a/spec/spec_helper.rb b/spec/spec_helper.rb index 23c24b0..b152d1a 100644 --- a/spec/spec_helper.rb +++ b/spec/spec_helper.rb @@ -60,6 +60,11 @@ def spec_logger # # See https://rubydoc.info/gems/rspec-core/RSpec/Core/Configuration RSpec.configure do |config| + # Disable mock documents by default in tests unless explicitly enabled + config.before do + Colore::C_.mock_documents_enabled = false + end + # rspec-expectations config goes here. 
You can use an alternate # assertion/expectation library such as wrong or the stdlib/minitest # assertions if you prefer.