---
title: "Direct Torch Gradient Methods"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
always_allow_html: yes
vignette: >
  %\VignetteIndexEntry{Direct Torch Gradient Methods}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = torch::torch_is_installed(),
  fig.align = "center",
  fig.width = 7,
  fig.height = 5
)
```

```{r setup, echo = FALSE}
library(innsight)
library(torch)
set.seed(42)
torch_manual_seed(42)
```

## Introduction

The **innsight** package provides two ways to apply gradient-based attribution
methods to neural networks:

1. **Converter-based approach**: Works with models from `torch`, `keras`, and
   `neuralnet`. Converts the model internally and provides access to all
   interpretation methods (LRP, DeepLift, gradients, etc.).

2. **Direct torch approach**: Specialized functions for `torch` models that use
   native torch autograd directly without conversion overhead.

This vignette introduces the **direct torch gradient methods**, which are
optimized for `torch` models and provide a streamlined workflow when you only
need gradient-based explanations. These methods are not limited to sequential
architectures: they can be applied to **any torch-based model** that supports
autograd, including custom modules and complex architectures. The only
requirement is that the model's output is a single tensor (not a list of
tensors) that is differentiable with respect to the input features.
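
To make the requirement concrete, here is a minimal sketch of a non-sequential
model that satisfies it. The module `skip_net` and all `*_demo` names are
hypothetical and only serve as an illustration; the chunk calls torch autograd
directly rather than any innsight function.

```{r sketch_custom_module}
library(torch)

# Hypothetical non-sequential model: a skip connection around a hidden layer
skip_net <- nn_module(
  "skip_net",
  initialize = function(n_in, n_out) {
    self$hidden <- nn_linear(n_in, n_in)
    self$head <- nn_linear(n_in, n_out)
  },
  forward = function(x) {
    h <- nnf_relu(self$hidden(x)) + x  # skip connection
    self$head(h)                       # a single output tensor
  }
)

net_demo <- skip_net(4, 2)
x_demo <- torch_ones(3, 4, requires_grad = TRUE)

# Autograd can differentiate the first output node w.r.t. the input
out_demo <- net_demo(x_demo)
grad_demo <- autograd_grad(out_demo[, 1]$sum(), x_demo)[[1]]
dim(grad_demo)
```

A model whose `forward()` returned a list of tensors would not meet the
requirement stated above.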

## Available Methods

The following gradient-based methods are available as direct torch functions:

- `torch_grad()`: Vanilla Gradient and Gradient×Input (similar to `run_grad()`)
- `torch_intgrad()`: Integrated Gradients (similar to `run_intgrad()`)
- `torch_smoothgrad()`: SmoothGrad and SmoothGrad×Input (similar to `run_smoothgrad()`)
- `torch_expgrad()`: Expected Gradients/GradSHAP (similar to `run_expgrad()`)

All methods compute feature attributions that help understand which input
features contribute most to the model's predictions.

## Basic Usage

### Vanilla Gradient

The simplest gradient-based method computes the gradient of the output with
respect to the input:

$$\frac{\partial f(x)_j}{\partial x_i}$$

```{r vanilla_gradient}
# Create a simple model
model <- nn_sequential(
  nn_linear(10, 20),
  nn_relu(),
  nn_linear(20, 3)
)

# Generate sample data
data <- torch_randn(5, 10)

# Calculate gradients
gradients <- torch_grad(model, data)

# Result shape: (batch_size, features, outputs)
dim(gradients)
```

The result is a tensor with shape `(batch_size, features, outputs)` where each
element represents the sensitivity of an output to an input feature. For more
details on this method, see the documentation for `run_grad()` which uses the 
same underlying calculations.
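
The `(batch_size, features, outputs)` layout can be reproduced by hand with
plain torch autograd. The following is only a sketch of the idea, not
necessarily how `torch_grad()` is implemented; the names `net_vg`, `x_vg` and
`manual_grads` are made up for this illustration.

```{r sketch_manual_gradient}
library(torch)

net_vg <- nn_sequential(nn_linear(4, 8), nn_tanh(), nn_linear(8, 2))
x_vg <- torch_randn(3, 4, requires_grad = TRUE)
out_vg <- net_vg(x_vg)

# One backward pass per output node. Summing over the batch is valid because
# each sample's output depends only on that sample's input.
grads_per_output <- lapply(seq_len(out_vg$shape[2]), function(j) {
  autograd_grad(out_vg[, j]$sum(), x_vg, retain_graph = TRUE)[[1]]
})

# Stack along a new last dimension: (batch_size, features, outputs)
manual_grads <- torch_stack(grads_per_output, dim = 3)
dim(manual_grads)
```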

**Note:** By default, functions return raw `torch_tensor` objects. For additional
features like plotting and data conversion, use `return_object = TRUE` (see
section "Using Results as innsight Objects").

### Gradient×Input

By setting `times_input = TRUE`, we multiply the gradients by the input values.
This provides an approximate decomposition (a first-order Taylor decomposition)
of the output into feature-wise contributions:

$$x_i \cdot \frac{\partial f(x)_j}{\partial x_i}$$

```{r gradient_times_input}
# Calculate Gradient×Input
grad_times_input <- torch_grad(model, data, times_input = TRUE)

# The sum approximates the output value
output <- model(data)
sum_attributions <- grad_times_input$sum(dim = 2)

# Compare (should be similar)
print(paste("Output:", as.numeric(output[1, 1])))
print(paste("Sum of attributions:", as.numeric(sum_attributions[1, 1])))
```
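
Why the sum is only approximately equal to the output can be seen on an affine
toy model, where the decomposition is exact up to the bias term. This is a
small base R sketch; `W_lin`, `b_lin` and `x_lin` are hypothetical values and
not taken from the model above.

```{r sketch_grad_times_input_linear}
# Hypothetical affine model f(x) = W x + b with two outputs and three features
W_lin <- matrix(c(1, -2, 0.5,
                  0,  3, -1), nrow = 2, byrow = TRUE)
b_lin <- c(0.25, -1)
x_lin <- c(2, 1, 4)

f_lin <- as.vector(W_lin %*% x_lin + b_lin)

# The gradient of output j w.r.t. feature i is W[j, i], so Gradient×Input is
# W[j, i] * x[i]. Rows are features, columns are outputs.
gxi_lin <- t(W_lin) * x_lin

# The column sums recover f(x) - b: everything except the bias term
colSums(gxi_lin)
f_lin - b_lin
```

For networks with nonlinear activations and biases, the remaining gap is
generally not just a single bias vector, which is why the comparison above
says "similar" rather than "equal".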

### Integrated Gradients

Integrated Gradients computes the integral of gradients along a path from a
baseline $x'$ (typically zeros) to the actual input $x$:

$$(x - x') \cdot \int_{\alpha=0}^{1} \frac{\partial f(x' + \alpha (x - x'))}{\partial x} d\alpha$$

```{r integrated_gradients}
# Use zero baseline (default)
int_grads <- torch_intgrad(model, data, n = 50)

# Or specify custom baseline
baseline <- torch_zeros(1, 10)
int_grads_custom <- torch_intgrad(model, data, x_ref = baseline, n = 50)

# The attributions approximately sum to f(x) - f(x') (completeness)
baseline_output <- model(baseline)
output <- model(data)
diff <- output - baseline_output$expand_as(output)
sum_attributions <- int_grads$sum(dim = 2)

# Should be very close
max_diff <- (diff - sum_attributions)$abs()$max()$item()
print(paste("Max difference:", max_diff))
```

The parameter `n` controls the number of interpolation steps. More steps
generally give more accurate results but increase computation time.
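
The effect of `n` can be illustrated without a neural network. The sketch
below approximates the integral with a midpoint Riemann sum for a toy function
with a known gradient. The choice of the midpoint rule and the names `f_ig`,
`grad_f_ig` and `ig_riemann` are assumptions made for this illustration;
`torch_intgrad()` may discretize the path differently.

```{r sketch_intgrad_riemann}
# Toy function with a known gradient (hypothetical, not a torch model)
f_ig <- function(x) sum(x^3)
grad_f_ig <- function(x) 3 * x^2

# Midpoint Riemann sum over n interpolation steps between x_ref and x
ig_riemann <- function(x, x_ref, n) {
  alphas <- (seq_len(n) - 0.5) / n
  grads <- vapply(alphas, function(a) grad_f_ig(x_ref + a * (x - x_ref)),
                  numeric(length(x)))
  (x - x_ref) * rowMeans(grads)
}

x_ig <- c(1, -2, 0.5)
x_ref_ig <- c(0, 0, 0)

# Completeness gap |sum(IG) - (f(x) - f(x'))| for an increasing number of steps
gaps_ig <- vapply(c(5, 50, 500), function(n) {
  abs(sum(ig_riemann(x_ig, x_ref_ig, n)) - (f_ig(x_ig) - f_ig(x_ref_ig)))
}, numeric(1))
gaps_ig
```

The gap shrinks as the number of steps grows, at the cost of one gradient
evaluation per step.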

### SmoothGrad

SmoothGrad reduces noise in gradient-based explanations by averaging gradients
over multiple noisy versions of the input:

$$\frac{1}{n} \sum_{i=1}^n \frac{\partial f(x + \epsilon_i)}{\partial x}$$

where $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$ and $\sigma$ is the standard
deviation of the noise.

```{r smoothgrad}
# Calculate SmoothGrad with 50 noisy samples
smooth_grads <- torch_smoothgrad(
  model, data,
  n = 50,
  noise_level = 0.1  # σ = 0.1 * (max(x) - min(x))
)

# Compare with vanilla gradient
vanilla_grads <- torch_grad(model, data)

# Compare the spread of the attribution values (only a rough proxy for noise)
cat(paste("SmoothGrad std:", smooth_grads$std()$item(), "\n"))
cat(paste("Vanilla Gradient std:", vanilla_grads$std()$item()))
```
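
The averaging step itself can be sketched in base R on a toy function whose
smoothed gradient is known in closed form. The sketch assumes that the noise
scale is `noise_level * (max(x) - min(x))`, as in the comment above, computed
here over a single input vector. All `*_sg` names are hypothetical.

```{r sketch_smoothgrad_manual}
# Gradient of the toy function f(x) = sum(sin(x))
grad_f_sg <- function(x) cos(x)

x_sg <- c(0, 1, 2)
noise_level_sg <- 0.1
sigma_sg <- noise_level_sg * (max(x_sg) - min(x_sg))
n_sg <- 5000

# Evaluate the gradient at n noisy copies of the input and average
noisy_grads <- replicate(
  n_sg,
  grad_f_sg(x_sg + rnorm(length(x_sg), sd = sigma_sg))
)
smooth_sg <- rowMeans(noisy_grads)

# For Gaussian noise, E[cos(x + eps)] = cos(x) * exp(-sigma^2 / 2)
rbind(estimate = smooth_sg,
      closed_form = cos(x_sg) * exp(-sigma_sg^2 / 2))
```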

### Expected Gradients

Expected Gradients (also called GradSHAP) extends Integrated Gradients by
averaging over multiple reference values from a distribution:

$$\mathbb{E}_{x' \sim X', \alpha \sim U(0,1)} \left[ (x - x') \cdot \frac{\partial f(x' + \alpha (x - x'))}{\partial x} \right]$$

```{r expected_gradients}
# Create a reference distribution (e.g., from training data)
reference_data <- torch_randn(100, 10)

# Calculate Expected Gradients
exp_grads <- torch_expgrad(
  model, data,
  data_ref = reference_data,
  n = 50
)

# This provides approximate Shapley values
dim(exp_grads)
```
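
The expectation in the formula is usually estimated by Monte Carlo sampling.
The following base R sketch does this for a toy function with a known
gradient; it is an illustration of the estimator under that assumption, not
the implementation behind `torch_expgrad()`, and all `*_eg` names are
hypothetical.

```{r sketch_expgrad_manual}
# Gradient of the toy function f(x) = sum(x^2)
grad_f_eg <- function(z) 2 * z

x_eg <- c(1, 2)
ref_eg <- matrix(c( 0.0,  0.0,
                    0.5, -0.5,
                   -0.5,  0.5,
                    0.2,  0.1), ncol = 2, byrow = TRUE)
n_eg <- 20000

# Draw a reference row and an interpolation coefficient for every sample
refs <- ref_eg[sample(nrow(ref_eg), n_eg, replace = TRUE), , drop = FALSE]
alpha <- runif(n_eg)
xs <- matrix(x_eg, nrow = n_eg, ncol = length(x_eg), byrow = TRUE)

# Average (x - x') * gradient at the interpolated point
attr_eg <- colMeans((xs - refs) * grad_f_eg(refs + alpha * (xs - refs)))

# For this f the exact expectation is x^2 - mean(x'^2) per feature
rbind(estimate = attr_eg, exact = x_eg^2 - colMeans(ref_eg^2))
```

Summed over features, the estimate approaches $f(x)$ minus the mean of $f$
over the reference data.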

## Working with Real Data

Let's see a complete example using the Iris dataset:

```{r iris_example}
# Load data
data(iris)

# Prepare data
X <- as.matrix(iris[, 1:4])
y <- as.integer(iris$Species)

# Convert to tensors
X_tensor <- torch_tensor(X, dtype = torch_float())
y_tensor <- torch_tensor(y, dtype = torch_long())

# Create and train a simple model
model <- nn_sequential(
  nn_linear(4, 10),
  nn_relu(),
  nn_linear(10, 3)
)

optimizer <- optim_adam(model$parameters, lr = 0.01)

# Quick training loop
for (epoch in 1:100) {
  optimizer$zero_grad()
  output <- model(X_tensor)
  loss <- nnf_cross_entropy(output, y_tensor)
  loss$backward()
  optimizer$step()
}

# Select samples to explain
samples <- X_tensor[sample(150, 50), ]

# Calculate different attributions
vanilla <- torch_grad(model, samples, output_idx = 1)
grad_input <- torch_grad(model, samples, output_idx = 1, times_input = TRUE)
int_grads <- torch_intgrad(model, samples, output_idx = 1, n = 50)
smooth_grads <- torch_smoothgrad(model, samples, output_idx = 1, n = 50)

# Compare attributions for the first selected sample and output class 1 (setosa)
cat("Feature attributions for the first sample, output class 1 (setosa):\n")
cat("Vanilla Gradient:", as.numeric(vanilla[1, , 1]), "\n")
cat("Gradient×Input:  ", as.numeric(grad_input[1, , 1]), "\n")
cat("Integrated Grads:", as.numeric(int_grads[1, , 1]), "\n")
cat("SmoothGrad:      ", as.numeric(smooth_grads[1, , 1]), "\n")
```

## Selecting Output Nodes

By default, attributions are computed for all output nodes. You can select
specific outputs with the `output_idx` parameter:

```{r output_selection}
# Calculate gradients for output node 2 only
grads_class2 <- torch_grad(model, samples, output_idx = 2)
dim(grads_class2)  # (50, 4, 1) - only one output

# Calculate for multiple outputs
grads_multi <- torch_grad(model, samples, output_idx = c(1, 3))
dim(grads_multi)  # (50, 4, 2) - two outputs
```

## Data Type Support

All methods support both `float` and `double` precision:

```{r dtype}
# Use double precision for higher accuracy
grads_double <- torch_grad(model, samples, dtype = "double")
grads_float <- torch_grad(model, samples, dtype = "float")

# Check dtype
cat(paste("Is double precision?", grads_double$dtype == torch_double()), "\n")

# Show difference
max_diff <- (grads_double - grads_float)$abs()$max()$item()
cat(paste("Max difference between double and float:", max_diff))
```
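
The order of magnitude of such differences follows from the rounding error of
single precision. As a small sketch using only torch tensors (the names
`x_f32`, `x_f64` and `rounding_err` are made up for this example):

```{r sketch_float_vs_double}
library(torch)

# 0.1 has no exact binary representation: float32 keeps about 7 significant
# decimal digits, float64 about 16
x_f32 <- torch_tensor(0.1, dtype = torch_float())
x_f64 <- torch_tensor(0.1, dtype = torch_double())

# Cast the float32 value up to double to expose its rounding error
rounding_err <- (x_f32$to(dtype = torch_double()) - x_f64)$abs()$item()
rounding_err
```

Errors of this size accumulate through the forward and backward pass, so
small deviations between the two precisions are expected.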

## When to Use Which Method?

Use **direct torch methods** (`torch_grad`, etc.) when:

- Working exclusively with `torch` models
- You only need gradient-based explanations
- You want a lightweight approach without the model conversion step

Use **converter-based methods** (`run_grad`, `run_lrp`, etc.) when:

- Working with `keras` or `neuralnet` models
- You need other (backpropagation-based) methods like LRP or DeepLift

## Comparison with Converter Methods

The direct torch methods produce the same results as the converter-based
approach (up to numerical precision) but with less overhead:

```{r comparison, eval = FALSE}
# Direct approach
grads_direct <- torch_grad(model, samples, output_idx = 1)

# Converter approach
converter <- Converter$new(model, input_dim = c(4))
grads_converter <- Gradient$new(
  converter,
  as.array(samples),
  output_idx = 1,
  verbose = FALSE
)
result_converter <- grads_converter$get_result("array")

# Results are equivalent (within numerical precision)
max_diff <- max(abs(as.array(grads_direct[,,1]) - result_converter[,,1]))
print(max_diff)  # < 1e-5
```

## Using Results as innsight Objects

By default, the torch gradient functions return raw tensors for maximum
flexibility and minimal overhead. However, you can also get results as
full-featured `innsight` objects that support plotting and other methods.

### Getting Results as Objects

Use `return_object = TRUE` to get an `InterpretingMethod`-compatible result object:

```{r object_results}
# Get result as object
result_obj <- torch_grad(
  model, samples,
  output_idx = 1,
  return_object = TRUE
)

# View summary
print(result_obj)
```

### Available Methods

The returned object inherits from `InterpretingMethod` and provides the same
interface as converter-based methods:

```{r object_methods}
# Get results in different formats
result_array <- get_result(result_obj, "array")
result_tensor <- get_result(result_obj, "torch_tensor")
result_df <- get_result(result_obj, "data.frame")

# View data.frame
head(result_df)
```

### Plotting with Objects

The object interface includes plotting capabilities:

```{r object_plot, fig.width=7, fig.height=4}
# Create plot
plot(result_obj, data_idx = 1, output_idx = 1)

# Create a global plot summarizing all samples
plot_global(result_obj, output_idx = 1)
```


## Summary

The direct torch gradient methods provide an efficient way to compute
gradient-based feature attributions for `torch` models. Key benefits include:

- **Native integration**: Uses torch autograd directly
- **Efficiency**: No conversion overhead
- **Simplicity**: Straightforward function calls
- **Equivalence**: Matches the results of converter methods up to numerical precision
- **Flexibility**: Supports all common gradient-based attribution methods

For comprehensive neural network explanations including non-gradient methods
and advanced visualizations, see the main `innsight` workflow using the
`Converter` class.