# Avodel Web Bot

A Symfony bundle providing a foundation for automating web interactions through controlled sequential execution of actions. The bundle handles browser automation, CAPTCHA verification, AJAX waiting, network error recovery, and human-like behavior simulation.

The core concept is a configurable **action pipeline**: a loop that repeatedly walks the action chain, checks which action applies to the current page state, and executes it. When an action throws, the worker tries each exception handler in turn and stops only if none of them handles the exception.

## Business Logic

### Core Concept

**Problem**: Web automation requires handling dynamic page states, unpredictable errors (network issues, CAPTCHA challenges, frame detachment), and detection avoidance.

**Solution**: A state machine that continuously evaluates and executes applicable actions, with chainable exception handlers for recovery. Each action declares when it applies (`isApplicable`) and what it does (`perform`).

### Main Process

```
1. CLI ENTRY (webbot:run <profile>)
   │
   ├── --ttl              → Time limit in seconds
   └── --pause-between-user-actions → Pause range (e.g., 1000-6000ms)

2. WORKER INITIALIZATION
   │
   ├── Create WebDriver instance
   ├── Build Context with Options
   └── Load action/handler chain from profile

3. ACTION LOOP (while !shouldStop)
   │
   ├── For each action in chain:
   │   ├── Check isApplicable(webDriver, context)
   │   ├── If true → perform(webDriver, context)
   │   ├── Log performed action
   │   └── Restart from first action
   │
   └── On exception:
       ├── Try each exception handler
       ├── If handled (returns true) → continue loop
       └── If unhandled → stop worker

4. TERMINATION
   │
   ├── Stop WebDriver
   ├── Clear states
   └── Log completion
```
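The loop in step 3 can be sketched in plain PHP as follows. This is a minimal illustration, not the bundle's `Worker`: `SketchAction`, `SketchExceptionHandler`, `runLoop()`, and `ClickOnceAction` are hypothetical names, an `ArrayObject` stands in for the WebDriver and Context pair, and PHP 8.0 or later is assumed.

```php
<?php
declare(strict_types=1);

// Simplified stand-ins for the bundle's contracts (hypothetical names).
// The real interfaces receive a WebDriver and a Context instead of $page.
interface SketchAction
{
    public function isApplicable(ArrayObject $page): bool;
    public function perform(ArrayObject $page): void;
}

interface SketchExceptionHandler
{
    public function handleException(ArrayObject $page, Throwable $exception): bool;
}

/**
 * @param list<SketchAction>           $actions
 * @param list<SketchExceptionHandler> $handlers
 * @param callable(): bool             $shouldStop
 * @return list<string> class names of the performed actions, in order
 */
function runLoop(array $actions, array $handlers, ArrayObject $page, callable $shouldStop): array
{
    $performed = [];

    while (!$shouldStop()) {
        try {
            foreach ($actions as $action) {
                if ($action->isApplicable($page)) {
                    $action->perform($page);
                    $performed[] = $action::class;
                    break; // restart evaluation from the first action
                }
            }
        } catch (Throwable $exception) {
            $handled = false;
            foreach ($handlers as $handler) {
                if ($handler->handleException($page, $exception)) {
                    $handled = true; // recovered: continue the outer loop
                    break;
                }
            }
            if (!$handled) {
                break; // no handler recovered: stop the worker
            }
        }
    }

    return $performed;
}

// Example action: applies until it has run once.
final class ClickOnceAction implements SketchAction
{
    public function isApplicable(ArrayObject $page): bool
    {
        return !isset($page['clicked']);
    }

    public function perform(ArrayObject $page): void
    {
        $page['clicked'] = true;
    }
}

$page = new ArrayObject();
$performed = runLoop(
    [new ClickOnceAction()],
    [],
    $page,
    fn (): bool => isset($page['clicked']) // hypothetical stop condition
);
```

If no action applies, this sketch simply re-evaluates the chain until the stop condition holds; in the bundle, `FallbackAction` sits at the end of the chain for unexpected states.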

### Action Execution Order

The bundle automatically prepends and appends built-in actions around custom actions:

| Order | Action | Purpose |
|-------|--------|---------|
| 1 | `StopWorkerOnTimeLimitAction` | Check/enforce TTL |
| 2 | `StartWebDriverAction` | Initialize browser |
| 3 | `PauseBetweenUserActionsAction` | Simulate human pauses |
| 4* | `WaitForCaptchaWidgetReadinessAction` | Wait for CAPTCHA render |
| 5* | `CaptchaVerifierAction` | Verify CAPTCHA |
| 6 | `WaitUntilAjaxRequestsFinishedAction` | Wait for async requests |
| 7 | **[Custom Actions]** | User-defined logic |
| 8 | `BadGatewayAction` | Detect 502 errors |
| 9 | `ClientClosedRequestAction` | Detect 499 errors |
| 10 | `NetworkErrorAction` | Detect network errors |
| 11 | `FallbackAction` | Catch unexpected states |

\* Included only when `auto_captcha_verification` is `true`.
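The ordering in the table can be expressed as a small function. This is a sketch only: `buildActionChain()` is a hypothetical helper, the action names are shortened to the class names shown in the table, and the bundle itself presumably assembles the chain through its service configuration.

```php
<?php
declare(strict_types=1);

/**
 * Assemble the full action chain for one profile (hypothetical helper).
 *
 * @param list<string> $customActions service IDs of user-defined actions
 * @return list<string>
 */
function buildActionChain(array $customActions, bool $autoCaptchaVerification = true): array
{
    $before = [
        'StopWorkerOnTimeLimitAction',
        'StartWebDriverAction',
        'PauseBetweenUserActionsAction',
    ];

    if ($autoCaptchaVerification) {
        // The CAPTCHA actions join the chain only when the profile enables them.
        $before[] = 'WaitForCaptchaWidgetReadinessAction';
        $before[] = 'CaptchaVerifierAction';
    }

    $before[] = 'WaitUntilAjaxRequestsFinishedAction';

    $after = [
        'BadGatewayAction',
        'ClientClosedRequestAction',
        'NetworkErrorAction',
        'FallbackAction',
    ];

    // Built-in actions wrap the custom ones on both sides.
    return array_merge($before, $customActions, $after);
}

$chain = buildActionChain(['App\Bot\MyAction1', 'App\Bot\MyAction2'], false);
echo implode("\n", $chain), "\n";
```

With CAPTCHA handling disabled, the custom actions start at position 5 instead of position 7.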

### Exception Handler Chain

| Order | Handler | Handles |
|-------|---------|---------|
| 1 | `CaptchaCheckboxFrameDisappearedExceptionHandler` | Transient CAPTCHA frame issues |
| 2 | `FrameNotFoundOrNotSwitchableExceptionHandler` | Frame switching failures |
| 3 | **[Custom Handlers]** | User-defined recovery |
| 4 | `FallbackExceptionHandler` | Generic recovery with backoff |

### State Management

The `Context` persists throughout a run and provides:
- **Run ID**: Unique identifier (40-char hex) for log correlation
- **States**: Type-indexed storage implementing `StateInterface`
- **Options**: Runtime configuration (TTL, pause settings)
- **History**: List of performed actions

States are lazily initialized: on first access, the `Context` creates the state through its `createDefault()` factory method.
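A type-indexed store with lazy initialization could look like the sketch below. `SketchState`, `SketchContext`, `getState()`, `clearStates()`, and `CounterState` are hypothetical names, and the static `createDefault()` signature is an assumption based on the description above; PHP 8.0 or later is assumed.

```php
<?php
declare(strict_types=1);

// Stand-in for the bundle's StateInterface (hypothetical name and signature).
interface SketchState
{
    public static function createDefault(): static;
}

// Example state: counts how often something happened during the run.
final class CounterState implements SketchState
{
    public int $count = 0;

    public static function createDefault(): static
    {
        return new static();
    }
}

final class SketchContext
{
    /** @var array<class-string<SketchState>, SketchState> */
    private array $states = [];

    /**
     * @template T of SketchState
     * @param class-string<T> $stateClass
     * @return T
     */
    public function getState(string $stateClass): SketchState
    {
        // Create the state on first access, then reuse the same instance.
        return $this->states[$stateClass] ??= $stateClass::createDefault();
    }

    public function clearStates(): void
    {
        $this->states = [];
    }
}

$context = new SketchContext();
$context->getState(CounterState::class)->count++;
$context->getState(CounterState::class)->count++;
$count = $context->getState(CounterState::class)->count;
```

Indexing by class name means each state type has at most one instance per run, and an action never has to check whether a state exists before reading it.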

## Data Model

```
Worker (orchestrates execution)
├── actions[]           - ActionInterface implementations
│   └── ActionInterface
│       ├── isApplicable(webDriver, context): bool
│       └── perform(webDriver, context): void
├── exceptionHandlers[] - ExceptionHandlerInterface implementations
│   └── ExceptionHandlerInterface
│       └── handleException(webDriver, context, exception): bool
└── Context (per-run session)
    ├── runId           - unique execution identifier
    ├── states[]        - StateInterface storage
    ├── options         - Options (TTL, pause config)
    └── performedActions[]
```

See `src/Worker/` for interface contracts, `src/Context/` for runtime management.

## External Dependencies

| Service | Purpose | Integration |
|---------|---------|-------------|
| `avodel/web-driver` | Browser automation driver | WebDriver, Frames, AJAX, Mouse control |
| reCAPTCHA v2 | CAPTCHA challenges | Frame detection at `google.com/recaptcha/api2/` |
| hCaptcha | CAPTCHA challenges | Frame detection at `newassets.hcaptcha.com/captcha/` |
| Selenium WebDriver | Browser control | Via `php-webdriver/webdriver` |

## CLI Commands

| Command | Description |
|---------|-------------|
| `webbot:run <profile>` | Execute a worker profile |

**Options:**
- `--ttl <seconds>` — Worker time limit (default: infinite)
- `--pause-between-user-actions <min>-<max>` — Pause range in ms (e.g., `1000-6000`)

**Signal Handling:** Graceful shutdown on `SIGINT`/`SIGTERM`

## Bundle Configuration

```yaml
avodel_web_bot:
    backoff_delay_ms: 10000          # Delay for exception recovery (default: 10000)
    profile:
        my_profile:
            auto_captcha_verification: true  # Enable CAPTCHA auto-handling
            actions:
                - App\Bot\MyAction1
                - App\Bot\MyAction2
            exception_handlers:
                - App\Bot\MyExceptionHandler
```

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `backoff_delay_ms` | int | 10000 | Backoff delay (ms) applied during fallback exception recovery |
| `profile.<name>.auto_captcha_verification` | bool | true | Auto-detect and verify CAPTCHAs |
| `profile.<name>.actions` | array | [] | Custom action service IDs |
| `profile.<name>.exception_handlers` | array | [] | Custom handler service IDs |

## Installation

```bash
composer require avodel/web-bot
```

Register the bundle in your Symfony application, then configure profiles in `config/packages/avodel_web_bot.yaml`.

## Creating Custom Actions

Implement `ActionInterface`:

```php
use Avodel\WebBot\Worker\ActionInterface;
use Avodel\WebBot\Context\Context;
use Avodel\WebDriver\Driver\WebDriver;

class MyAction implements ActionInterface
{
    public function isApplicable(WebDriver $webDriver, Context $context): bool
    {
        // Return true when this action should execute
        return $webDriver->getPage()->find('css', '#my-element') !== null;
    }

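    public function perform(WebDriver $webDriver, Context $context): void
    {
        // Look the element up again: the page may have changed since isApplicable()
        $element = $webDriver->getPage()->find('css', '#my-element');
        if ($element !== null) {
            $element->click();
        }
    }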
}
```

## Creating Custom Exception Handlers

Implement `ExceptionHandlerInterface`:

```php
use Avodel\WebBot\Worker\ExceptionHandlerInterface;
use Avodel\WebBot\Context\Context;
use Avodel\WebDriver\Driver\WebDriver;

class MyExceptionHandler implements ExceptionHandlerInterface
{
    public function handleException(
        WebDriver $webDriver,
        Context $context,
        \Throwable $exception
    ): bool {
        // MyRecoverableException stands for an exception class defined by your application
        if ($exception instanceof MyRecoverableException) {
            // Attempt recovery
            return true; // Handled, continue execution
        }
        return false; // Not handled, try next handler
    }
}
```

## Built-in Extensions

| Extension | Purpose |
|-----------|---------|
| **WebDriver** | Browser lifecycle (start/stop) |
| **TimeLimit** | Enforce TTL on worker runs |
| **Pause** | Human-like delays with weighted random actions |
| **Captcha** | reCAPTCHA v2 and hCaptcha detection/verification |
| **Ajax** | Wait for AJAX completion in all frames |
| **Network** | Detect 502, 499, and network error pages |
| **Frame** | Handle frame switching failures |

### Pause Simulation

The `PauseMouseActionsSimulator` generates human-like behavior:
- Random waits (500-10000 ms)
- Scroll actions (100-1000px)
- Mouse movements (100-500px)
- Weighted action selection based on pause duration
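Weighted selection of the next simulated action can be sketched as below. This is not the code of `PauseMouseActionsSimulator`: `pickWeighted()` and `weightsForPause()` are hypothetical helpers, and the weights and the 3000 ms threshold are illustrative values chosen for the example.

```php
<?php
declare(strict_types=1);

/**
 * Pick one key from a map of non-negative integer weights.
 * A key with weight 3 is chosen three times as often as a key with weight 1.
 *
 * @param array<string, int> $weights
 */
function pickWeighted(array $weights): string
{
    $total = (int) array_sum($weights);
    if ($total <= 0) {
        throw new InvalidArgumentException('At least one weight must be positive.');
    }

    // Roll a number in 1..total and walk the cumulative weights.
    $roll = random_int(1, $total);
    foreach ($weights as $name => $weight) {
        $roll -= $weight;
        if ($roll <= 0) {
            return (string) $name;
        }
    }

    throw new LogicException('Unreachable when all weights are non-negative.');
}

/**
 * Illustrative rule: longer pauses favour idle waits, shorter ones favour
 * mouse movements. The numbers are assumptions, not the bundle's values.
 *
 * @return array<string, int>
 */
function weightsForPause(int $pauseMs): array
{
    return $pauseMs >= 3000
        ? ['wait' => 5, 'scroll' => 3, 'mouse_move' => 2]
        : ['wait' => 1, 'scroll' => 2, 'mouse_move' => 3];
}

$action = pickWeighted(weightsForPause(4500));
echo $action, "\n";
```

`random_int()` is used rather than `mt_rand()` only to avoid manual seeding; nothing here requires a cryptographically secure source.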

## Configuration

The bundle is configured through YAML only; there are no other variables to set. See `config/packages/avodel_web_bot.yaml` for profile configuration.
